Radeon IRC Logs For 2008-10-08

spstarr: waits
airlied: spstarr: just remove the __user from the structs.
spstarr: no longer needing the denotion of that being 'userspace' not kernelspace?
spstarr: denotation
spstarr: yes it built,
spstarr: in the meantime im waiting for the kernel rpm to build so i can confirm that upstream needs to revert that
spstarr: or rework it
spstarr: thus not impacting radeon and thus closing that other bug :-)
airlied: the kernel produced headers remove __user.
airlied: I just need to update the radeon copy
spstarr: oh in git
spstarr: waits for kernel
telexicon: spstarr, nah i just meant, why does the kernel disable the IRQ when something bad happens?
telexicon: is there a way to prevent that so you can debug?
spstarr: because it doesn't know what to do with the interrupt, mind you why doesn't it just send it somewhere and have the kernel issue a spurious interrupt
airlied: telexicon: once it gets 100,000 irqs that nobody cared about it disables it to stop messing up the machine.
telexicon: oh i see
spstarr: airlied: 100,000?
airlied: spstarr: I think thats the cutoff for unhandled irqs.
spstarr: thats a lot :)
airlied: spstarr: it might be radeon is getting an irq it doesn't handle.
airlied: you would need to dump some regs in the irq handler to see if that was the case
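
The cutoff airlied describes can be sketched as a pair of counters per IRQ line. This is a simplified model, not the kernel's code: the struct and function names are hypothetical, and the thresholds (a window of 100,000 interrupts, with the line disabled when more than 99,900 of them went unhandled) are an assumption based on how the spurious-interrupt check is usually described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-IRQ bookkeeping; the real kernel keeps similar
 * counters in its IRQ descriptor. */
struct irq_stats {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* of those, how many no handler claimed */
    bool disabled;           /* set once the line has been shut off   */
};

#define IRQ_WINDOW          100000u  /* assumed window size */
#define IRQ_UNHANDLED_LIMIT  99900u  /* assumed cutoff      */

/* Record one interrupt and whether any handler claimed it. */
void note_irq(struct irq_stats *s, bool handled)
{
    assert(s != NULL);
    if (s->disabled)
        return;
    if (!handled)
        s->unhandled++;
    s->count++;
    if (s->count < IRQ_WINDOW)
        return;
    /* End of window: almost nobody cared, so disable the line. */
    if (s->unhandled > IRQ_UNHANDLED_LIMIT)
        s->disabled = true;
    s->count = 0;
    s->unhandled = 0;
}
```

On a shared line, one device raising interrupts that no registered handler acknowledges is enough to trip this, which is why every other device on IRQ 9 stops working too.
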
spstarr: i will find out shortly
spstarr: reverting that git diff will confirm if this is upstreams fault
airlied: TMM: hehe.. neither nha or MAD are in here at the moment :)
airlied: TMM: nha is in .eu timezone though so should be around eventually
TMM: so am I :)
TMM: on my way to work <3 3g
TMM: airlied: am I correct that the current vertex shader commands are being pushed one by one ? the code appears to loop over all the opcodes and just push them to the card in those r300TranslateOpcode* functions
airlied: TMM: pretty much.
airlied: they don't get pushed to the card in that function
airlied: it just sets up the array for pushing later.
TMM: ah, ok
airlied: it only gets sent to the kernel in r300_state.c:r300SetupVertexProgramFragment
airlied: well the state emission sends it to the kernel
TMM: so, basically right now no optimizations are done at all on the input stream? it just gets directly translated, put in an array and pushed to the card then?
airlied: pretty much
TMM: ok
TMM: this will indeed be a very interesting thing to work on
airlied: nha has some code in radeon_nqssadce.c
TMM: 'not quite ssa' ghe
spstarr: rpms done.. almost
spstarr: geesh, rpmbuild is so slow
airlied: spstarr: rebuilding rpms is definitely not the way to do debugging.
spstarr: airlied: kernel
spstarr: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=de85422b94ddb23c021126815ea49414047c13dc;hp=2542335ccf34cfb442d3fd842d7e78ca5e649951 <-- removed
airlied: I don't think I ever locally rebuild an rpm kernel
spstarr: i have a patch and modified the .spec to add it in
spstarr: its almost done, its just slow generating the rpm itself
spstarr: Wrote: /root/rpmbuild/RPMS/i686/kernel-2.6.27-0.398.rc9.fc10.i686.rpm
spstarr: its the debuginfo ones that are slow to write
spstarr: done
MrEIso: so, we're all in suspense, does it work? :)
spstarr: hehe
airlied: woot suspend/resume without bonghits.
MrEIso: airlied, heh
spstarr: i just need kernel-firmware from koji
TMM: airlied: don't expect a patch very soon, I think I'
TMM: airlied: I figured out what needs to be done now, and it won't be noon tomorrow :P
airlied: TMM: hehe.. no worries its a fairly big job :)
airlied: TMM: just talk to nha/glisse/MostAwesomeDude, they might have some ideas/help
airlied: TMM: the other option is the gallium idea about trying to use LLVM
TMM: that's actually a pretty cool idea
marcheu: airlied: actually it works for putting some VP on SSE now
marcheu: next I'll separate the backends
airlied: marcheu: TMM is just wondering about optimising radeon VPs for r500.
airlied: r400/r500 even.
marcheu: well for SSE it sure generates kickass code
airlied: &
TMM: airlied: I'll also need a firmer understanding of the GL vertex programs. as it turns out, I seem to miss some detailed knowledge
marcheu: for GPUs we'll need llvm hw description files
MrEIso: VP?
marcheu: vertex program
MrEIso: oh right, duh.
TMM: marcheu: so, llvm's code generation is 'kick ass' now? awesome :)
marcheu: well it is better than our own internal sse assembler
TMM: for mesa?
marcheu: yeah
TMM: ah
marcheu: granted there are still rough edges ATM... like it writes intermediate temps to memory because our IR description is poor
TMM: are you the nouvoux guy btw?
marcheu: no, but I'm the nouveau guy
TMM: my french sucks :)
spstarr: ok
spstarr: im in the new kernel + git DDX code
spstarr: if i run glxgears let's see if i get IRQ timeout
TMM: I meant no disrespect :)
marcheu: anyway the problem with radeon ATM is there's no gallium driver, and you need it to use the llvm framework
TMM: is gallium in a usable state yet?
marcheu: yeah
marcheu: there's still the occasional bug, but it's quite stable now. the issue is more writing the drivers :)
TMM: hmm, so... I can either implement the vp compiler from scratch, or port r300 to gallium and use llvm
marcheu: yeah no one from radeon dared making the jump yet :)
TMM: did nouveau move over? (spelling! :D)
marcheu: yeah, we never even finished the old-style dri driver
spstarr_home: airlied:
spstarr_home: kernel:[ 215.933980] Disabling IRQ #9
spstarr_home: do_wait: drmWaitVBlank returned -1, IRQs don't seem to be working correctly.
spstarr_home: Try adjusting the vblank_mode configuration parameter.
spstarr_home: Segmentation fault
spstarr_home: confirming my kernel really does NOT have the bad code still
TMM: marcheu: how's it coming with nouveau anyway? did nvidia ever bother to help out in any way yet?
marcheu: things are going well, nvidia doesn't help
marcheu: currently I'd like to move the drivers to using llvm. that'll also allow support for older cards
spstarr_home: grrrrrrrrrr
spstarr_home: its still there
TMM: brb
spstarr_home: it fscking didnt revert the change!
spstarr_home: fuc
MrEIso: spstarr_home, doh :)
spstarr_home: but I followed the patch process
spstarr_home: but wait
spstarr_home: it did :)
spstarr_home: im just confused with the other directory it made
spstarr_home: vanilla-2.6.27-rc9 does not have my change
spstarr_home: but vanilla-2.6.26 does
MrEIso: ah right
spstarr_home: not sure if thats how kernel.spec does its patching
spstarr_home: since .27-rc9 has no binary objects
MrEIso: It's been nearly 10 years since I last seriously used redhat sorry :)
spstarr_home: airlied: ok so it's for you :)
spstarr_home: test two
spstarr_home: disable usb altogether 'nousb' in grub...
spstarr: ok no usb loaded trying glxgears
spstarr_home: [ 277.309375] handlers:
spstarr_home: [ 277.309380] [] (acpi_irq+0x0/0x28)
spstarr_home: [ 277.309393] [] (e1000_intr+0x0/0x13f [e1000])
spstarr_home: its not usb change
spstarr_home: updates bug.. looks like this one really is for you airlied ;(
MrEIso: spstarr_desk, hmm
MrEIso: spstarr_desk, what tree are you running?
spstarr_home: Linux segfault.sh0n.net 2.6.27-0.398.rc9.fc10.i686 #1 SMP Wed Oct 8 02:00:39 EDT 2008 i686 i686 i386 GNU/Linux
spstarr_home: airlied: http://bugzilla.kernel.org/show_bug.cgi?id=11700
TMM: gallium looks seriously awesome btw
TMM: I just read their wiki
spstarr_home: TMM: which wiki?
spstarr_home: its updated?
TMM: how would I know if it's updated? I just read it for the first time. The last time I read anything about gallium was in the 'this would be cool stage' of things
MrEIso: spstarr_desk, once upon a time, in the dark ages of x86's prehistory, IRQ 9 was the vertical retrace interrupt.
MrEIso: spstarr_desk, I wonder if somehow that's gotten re-enabled.
spstarr_home: 9 is where ACPI lives on ;)
spstarr_home: oddly
spstarr_home: why is my laptop using PIC
spstarr_home: and not IOAPIC
MrEIso: hmm
spstarr_home: i could try to force ioapic on
MrEIso: line 41 of drivers/gpu/drm/radeon/radeon_irq.c
MrEIso: is where it figures out what interrupts the card thinks are outstanding
spstarr_home: bug?
spstarr_home: looks at git
MrEIso: not as such
spstarr_home: i dont know if XT-PIC-XT is right
MrEIso: it looks like it only considers SW_INT_TEST_ACK, CRTC{,2}_VBLANK_STAT
spstarr_home: looks at the code
MrEIso: (line 75 now)
spstarr_home: are there other requests?
spstarr_home: yea
spstarr_home: readeon_acknowedge_irqs
spstarr_home: radeon
MrEIso: indeed
spstarr_home: the header might tell us
MrEIso: (I know nothing about radeon btw, I'm just following through the code looking for something suspicious -- I suspect if airlied was awake they'd tell us we're barking up completely the wrong tree :)
spstarr_home: heh
MostAwesomeDude: checks clock
MostAwesomeDude: airlied's probably awake, but there's more to his life than Radeons. I think.
spstarr_home: here there is
MrEIso: awake as in "around on IRC"
spstarr_home: RADEON_GUI_IDLE_INT_ENABLE
spstarr_home: I dont see RADEON_SW_INT_ENABLE
spstarr_home: there's a few, but the names dont all tell me they are for interrupt
spstarr_home: only the INT ones seem to tell me, VBLANK doesn't
MrEIso: hmm
MrEIso: here's a thought
MrEIso: I hear rumours that there might be register documentation
MrEIso: we could (and here's the real shocker) read the docs :)
spstarr_home: sure
spstarr_home: this code is generic for all models
MrEIso: you have an r500?
spstarr_home: rv350 in this case
spstarr_home: does glxgears work for you?
spstarr_home: which kernel?
spstarr_home: you need to be in git kernels to see this
spstarr_home: but!
MrEIso: I don't have any radeon cards at the moment
spstarr_home: if we compare this code from the code in 2.6.26..
spstarr_home: we might see
spstarr_home: grabs tarball
spstarr_home: comparing the interrupt handling might shed light
spstarr_home: 2.6.26 had it under char/drm
MrCooper: you guys are probably barking up the wrong tree :) we're only enabling the GPU interrupts we're interested in
MrEIso: MrCooper, well yes, but presumably another interrupt is getting enabled somewhere along the lines
MrEIso: if the problem is as spstarr suspects, that the card is producing an interrupt that the driver isn't expecting
spstarr_home: MrCooper: there's absolutely nothing changed
spstarr_home: MrEIso too
MrCooper: if the IRQ is shared with another device, it's more likely related to that
MrEIso: MrCooper, do you have any ideas for a better tree to bark up?
spstarr_home: MrCooper: im down to two things, on irq 9 there's only e1000 and acpi shared
spstarr_home: MrCooper: when shutting down laptop, irq 9 gets disabled by ACPI (and thus, itself)
spstarr_home: but, drm is loaded
spstarr_home: i need sleep
MrCooper: so is his Radeon using the same IRQ or not?
MrEIso: you'd really have to ask him
MrEIso: I think he's triggering it by running glxgears
MrEIso: but because he's using XTPIC mode, everything and its brother is sharing the IRQ
airlied: it may be enabling irqs somewhere wrong; modesetting changes some of the irq handling.
airlied: I need to reconfirm nomodeset.
airlied: goes back to cooking dinner.
MrEIso: airlied, so saying what interrupts that the card thinks are pending might be interesting then.
hifi: ah, finally I got the correct resolution for this TFT, thanks devs!
buggs: j #lucene
spstarr_work: MrEIso: I dont know why it wants to use XTPIC
spstarr_work: thats old school
spstarr_work: MrCooper: I don't see the radeon in /proc/interrupts
spstarr_work: maybe because i can't when i attempt to enable using the device
spstarr_work: airlied: so you think it's due to kms and irq settings
airlied: spstarr_work: maybe.. if you use radeontool I think you can read RADEON_GEN_INT_CNTL and see if something is enabled that shouldn't be
spstarr_work: lemme note that
spstarr_work: airlied: after the IRQ is shut off?
spstarr_work: since i wont be able to trip it unless i enable composite or run a GL app
spstarr_work: only then will irq 9 be disabled (unless i unload radeon)
airlied: yeah the card should still have it enabled
spstarr_work: ok i will do that when I get home in an hour 40
spstarr_work: airlied: git DDX has no cut off issues (since kms stuff isn't in git master?)
spstarr_work: the 1/4 screen seems to be cut off, not properly drawing objects
airlied: spstarr_work: the cutoff, I reproduced it yesterday, fix it today time
spstarr_work: danke
airlied: finished suspend/resume yesterday hopefully.
spstarr_work: i will begin testing kms on the HD 3650 soon
spstarr_work: i still want to believe I can have 3D working again on the rv350
spstarr_work: i still have a debug X server with the DRI contexts being dumped log
spstarr_work: but if the irq issue can be solved, that will help me get back to testing the race issue
TMM: airlied: I have reviewed all my options, and I think that going the gallium route is the path of least resistance. Any thoughts?
airlied: TMM: its probably the most fun path :)
airlied: TMM: you might be able to do some small opts like nha did for FS without it.
airlied: TMM: MostAwesomeDude started looking at gallium on radeon as has glisse I think.
TMM: I figured that, but I can't come up with a good development cycle for the vertex shader compiler
TMM: I kinda feel that dumping it all in gallium and creating an llvm profile from the r500 specs would be easier to debug
airlied: it certainly would be an interesting project..
TMM: I just wished that I had figured out my problems were with the vertex compiler before I shelled out for a faster radeon :)
TMM: radeon appears to be slower in other respects than fglrx too, by a good margin. Are there any thoughts on what makes fglrx fast even for 'stupid' polygon pushing?
TMM: I mean 'faster' :)
spstarr_work: airlied: i think a Gallium(3D) driver is where we want go much like Nouveau
airlied: spstarr_work: never... what makes you think that? :-P
spstarr_work: ;)
airlied: TMM: would need to use fglrx first, but they have a quite large team of developers optimising it for every card and lots of games, so I suspect that alone is sufficient :)
TMM: I've looked at the nouveau code, and did some diffs, seems to me like porting r300 to gallium, while far from trivial, would be doable
TMM: airlied: well, I was quite miffed that even glxgears goes at more than 2x the speed... and that's hardly complicated gl code :)
airlied: not sure we support hyper-z yet on r300 though.
TMM: airlied: 12000+ fps to around 5000 for radeon
airlied: TMM: it still uses a vertex/fragment shader.
TMM: it does? since when? :-/
airlied: TMM: there isn't a fixed function pipeline anymore, so everything ends up using shaders under the hood
airlied: I'm not sure how large the default shaders are or if we could optimise them more though.
airlied: also hyperz would help gears.
TMM: well, right now I'm just hitting a wall with some programs that apparently want to push more commands
TMM: "Ran out of temps, num temps %d, us %d\n", vp->num_temporaries, u_temp_used); <-- that one
spstarr_work: airlied: I think of it this way, if we have a solid drm radeon driver toss out the old r300 driver and if the design of Gallium is right, we should have a much more reliable stack
spstarr_work: then fix the GLX to be more robust
airlied: spstarr_work: GLX is robust, you just have a bug that nobody cares about :)
airlied: don't attribute something not working for you, as a major design issue :)
spstarr_work: but you shouldn't think that, just because im tripping it in KDE doesnt mean it doesnt lurk for others out there
robokopp: <- cares at least :p
spstarr_work: robokopp: did you try?
TMM: I suppose that I could try and just increase the VFS_MAX_FRAGMENT_TEMPS macro and see what happens :P
TMM: but I wouldn't really know what I'm doing :)
robokopp: not yet... i got some boost problems here :(
spstarr_work: airlied: I noticed something in glxdri.c that caught my attention
airlied: it would be interesting if we could get a simple reproducer for that bug that wasn't kwin
spstarr_work: airlied: i hope to find out something else to trip it (maybe using the old xcompmgr)
spstarr_work: airlied: in glxdri.c theres one routine that does no sanity checking for validity
spstarr_work: it just frees something
spstarr_work: it is:
airlied: thats probably fine..
airlied: the problem isn't sanity checking.
spstarr_work: glXDRIcontextDestroy
airlied: you can't sanity check a pointer that someone else frees.
TMM: what's wrong with GLX?
airlied: you still think you have a valid pointer.
airlied: so you go and use it.
spstarr_work: airlied: but a driContext should not be assumed in this routine?
airlied: it is probably a missing callback to tell the DRI driver the context or drawable is deleted.
spstarr_work: shouldn't there be a (if context->driContext != NULL) ?
airlied: spstarr_work: it might be valid to assume a context.
airlied: spstarr_work: you might know it won't ever be NULL.
spstarr_work: hmm
spstarr_work: i have some debug on my server
spstarr_work: http://www.sh0n.net/spstarr/Xorg-glx-debug
spstarr_work: what stands out is getDrawableInfo
TMM: airlied: also; you said 'everything's a shader program now' is that directly true? ie, do ALL GL calls now get converted to shader or vertex programs before they get send to the card?
airlied: TMM: for shader cards..
airlied: so r300 and up for radeons are all shader based
spstarr_work: airlied:
spstarr_work: getDrawableInfo: USE: Address of data (which *should* be a __GLXDRIDrawable) is: 0xbf926cc4
spstarr_work: getDrawableInfo: USE: Address of data (which *should* be a __GLXDRIDrawable) is: 0xbf926c74
TMM: airlied: where does this conversion happen?
spstarr_work: can a __GLXDRIdrawable be BOTH pointing to the same pDraw?
airlied: TMM: in Mesa
airlied: spstarr_work: probably not sure about that.
airlied: TMM: lemme find it.
airlied: TMM: src/mesa/tnl/t_vp_build.c
airlied: src/mesa/main/texenvprogram.c
TMM: there regular GL calls get converted into GLSL, which then gets converted to the card's native bytecode in the various r300_* files?
airlied: TMM: not really they get converted into Mesa's intermediate representation
airlied: spstarr_work: I doubt that is the problem.
airlied: spstarr_work: I think we've explained what the problem is a few times now.
spstarr_work: well yes, you said context
spstarr_work: but the stack shows we fail in DRIGetDrawable() somewhere
spstarr_work: so either a context is being clobbered or a drawable is
spstarr_work: if what im seeing is the result of this (the above pastes), then i can begin to drill down into the server side DRI code and then the DRI r300 code
TMM: airlied: ah, ok. and the mesa bytecode then gets compiled into the card's native bytecode?
airlied: TMM: pretty much..
airlied: spstarr_work: it looks like a drawable goes away and the r300 driver doesn't know it has gone away.
airlied: its unlikely to be in the creation code, more likely to be in the deletion code.
TMM: airlied: and fglrx uses this too? ie; performance (loss) is not due to mesa being inefficient converting glsl/gl calls into it's own representation?
TMM: airlied: I know they use DRI, I'm not sure they use mesa though
spstarr_work: airlied: or not goes away but is left dangling ?
airlied: TMM: not sure what fglrx does, but its not likely to be that.
airlied: spstarr_work: it goes away in the higher layers but nobody tells the drivers.
spstarr_work: isn't the GLX supposed to do that?
airlied: spstarr_work: it might not be doing it.
spstarr_work: thats what im trying to debug
airlied: TMM: hyper-z and/or shader compiler is where I'd place the bets
spstarr_work: and if we're having two drawables pointing to one pDraw if you delete the drawable of one, the other is either NULL or garbage
TMM: airlied: ok, I am just orienting myself, I hope you don't mind the n00b
spstarr_work: and getDrawableInfo() appears to be a callback
spstarr_work: im not sure how this callback mechanism works in X
airlied: TMM: as long as you ask questions going in the intelligent direction :)
spstarr_work: airlied: Is there a way to walk though all the drawables?
spstarr_work: or you're supposed to have only one drawable , use it, release, it, delete it?
TMM: airlied: are they? :)
airlied: spstarr_work: you can have multiple drawables afaik
spstarr_work: ok
spstarr_work: it doesnt seem to make sense that multiple drawables would point to a pDraw of the same though
spstarr_work: thats seriously asking for trouble if you don't know which drawable really still needs the pDraw
airlied: I think you get a draw and a read drawable which can point at the same thing
spstarr_work: well we have two different contexts for each drawable (draw, read)
TMM: airlied: ok, very well. a vertex compiler it is then. I really need to start reading the chapters on 'optimization' in my dragon book I suppose :-/
airlied: TMM: or reading llvm :)
spstarr_work: actually no
spstarr_work: im wrong
airlied: spstarr_work: you could try running some other apps that don't die.
spstarr_work: airlied: the drawable uses the same pDraw, drawGlx context, and readGlx context
spstarr_work: airlied: thats what I've been doing for the last 5 months+ :)
airlied: spstarr_work: most likely the design is right, its just a missing cleanup path.
spstarr_work: avoiding using composite
TMM: airlied: or that... I'm just starting to wonder again what path to take
TMM: airlied: screw it, gallium has to be done anyway right? and noone else has any plans?
glisse: TMM: it's on rader on few people :)
glisse: just a matter of time, but we definitely want to move there
glisse: s/rader/radar
TMM: hmm, well. I suppose if I start something, even when it's bad, I suppose that maybe it'll kick some of you more suitable people to actually do it properly ;)
TMM: I'll just give it a go, I'll put it on my svn server and ask for comments regularly, to see if I'm going in the right direction
TMM: anyway, time for bed and all, thanks for the info airlied
spstarr_work: airlied: in anycase, tonight is irq night, if i can get you the needed information
spstarr_work: i cant even debug the other issue if it keeps killing the drm driver too
spstarr_work: "When it rains, it pours"
spstarr_work: i will run radeontool and confirm if the irq is still on, will that help you narrow things down though
spstarr_work: <- home..
spstarr_work: i will read your responses from spstarr_desk
agd5f: TMM: lots of Z-related optimizations we don't take advantage of yet and we don't use tiling for textures yet
agd5f: also a lot of shader and texture optimization, like a shader program space memory manager, and cycling through texture units
spstarr: airlied: your radeontool has no support for RADEON_GEN_INT_CNTL :(
spstarr: tries to add it
spstarr: or do we want RADEON_GEN_INT_STATUS to add
airlied: radeontool regmatch GEN_INT_CNTL
spstarr: none found
airlied: or radeontool regmatch 0x40
spstarr: oh
airlied: if you are using my radeontool and not the fedora one.
spstarr: yes
spstarr: right now it's
spstarr: GEN_INT_CNTL (0040) 0x00000000 (0)
spstarr: lemme start X and try to blow up things
spstarr: airlied with X loaded
spstarr: airlied here's the results:
spstarr: if im outside X with drm driver loaded
spstarr: GEN_INT_CNTL (0040) 0x00000000 (0)
airlied: that won't help
airlied: .
airlied: when it drops the IRQ I need it, with X still running
spstarr: GEN_INT_CNTL (0040) 0x02000000 (33554432) when switched
spstarr: when irq is cut
spstarr: GEN_INT_CNTL (0040) 0x02000000 (33554432)
airlied: spstarr: hmm nothing weird there.
spstarr: thats the state i see it in if glxgears barfs
airlied: hmm dang doesn't seem to be ignoring the interrupts then
spstarr: but i have to quit X if i want to reenable irq 9
spstarr: and reload e1000 manually to kick it
airlied: yup I only wanted to figure out if radeon was getting irqs it didn't know about.
spstarr: i dont know what 0x02000000 state is
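
The value pasted above, 0x02000000, has exactly one bit set: bit 25. A minimal sketch of decoding GEN_INT_CNTL follows. The bit positions and names are an assumption based on the RADEON_GEN_INT_CNTL definitions in the drm radeon header as best recalled, and should be checked against the header or the register documentation; the table type and the function are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct bit_name {
    uint32_t mask;
    const char *name;
};

/* Assumed bit layout of GEN_INT_CNTL (register offset 0x0040). */
static const struct bit_name gen_int_cntl_bits[] = {
    { UINT32_C(1) << 0,  "CRTC_VBLANK_MASK" },
    { UINT32_C(1) << 9,  "CRTC2_VBLANK_MASK" },
    { UINT32_C(1) << 19, "GUI_IDLE_INT_ENABLE" },
    { UINT32_C(1) << 25, "SW_INT_ENABLE" },
};

/* Write the names of the known bits set in value into out, separated
 * by " | ". Returns how many names were written. */
size_t decode_gen_int_cntl(uint32_t value, char *out, size_t len)
{
    size_t used = 0, hits = 0;
    size_t n_bits = sizeof gen_int_cntl_bits / sizeof gen_int_cntl_bits[0];

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n_bits; i++) {
        if (!(value & gen_int_cntl_bits[i].mask))
            continue;
        int n = snprintf(out + used, len - used, "%s%s",
                         hits ? " | " : "", gen_int_cntl_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* buffer too small for this name: stop here */
        used += (size_t)n;
        hits++;
    }
    return hits;
}
```

Under that assumed layout, 0x02000000 decodes to the software interrupt enable alone, which matches airlied's reading that nothing unexpected is switched on.
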
spstarr: im interested in why im using XT-PIC
spstarr: i wonder if i force kernel to use ioapic
spstarr: brb
spstarr: it seems it will use XT-PIC for interrupt routing always
spstarr: turning ioapic or turning on the lapic
spstarr: this isn't good.. so you dont think its the drm driver now?
spstarr: hmmmm
spstarr: sighs
Remosi: spstarr, so what conclusions do you have at the moment?
spstarr: one step forward, 2 steps back :(
spstarr: Remosi: well, i have a blocking bug that is blocking the other bug
Remosi: you still think it's radeon?
spstarr: i dont know, airlied doesn't know if it is or not yet, we're both not sure :/
spstarr: so nevermind the GL crashes, if im having IRQ problems then im blocked
spstarr: i will just have to use 2D land
spstarr: or, wait for kms to replace the need for this crap we have right now ;)
spstarr: which might be the best approach, airlied if you get time for agp + kms, then we can see if the irq issues vanish altogether
spstarr: Remosi: now you see why the kernel must do video mode switching and not X
airlied: spstarr: it won't help.. the irq isn't related to that.
airlied: spstarr: you could try a kernel without modesetting in it
spstarr: im going to be after yes
airlied: spstarr: just rebuild Fedora kernel without the drm patches
spstarr: since im off til tuesday i have a lot of time to debug this all
spstarr: and enjoy some of the last of summer
Remosi: spstarr, you're obviously in the wrong hemisphere
Remosi: :)
airlied: if rebuilding the fedora kernel without drm patches fixes the irq then modesetting broke it.
spstarr: Remosi: its fall now, but we're going to have an Indian summer starting Friday til Monday
spstarr: airlied: just disable the modesetting patch?
Remosi: spring here :)
spstarr: i can do that sure
spstarr: powers on quad
airlied: spstarr: disable the drm patches they are all in one place
spstarr: ya i see, very nice
spstarr: gimme 2 hours or so for the kernel to build
spstarr: drm-modesetting-radeon.patch done
spstarr: ok more then that all of them ;p
airlied: yeah you need to remove all of them they depend on each other
spstarr: i noticed :)
spstarr: that would be a conclusive marker if that's the case
spstarr: waits
spstarr: waits
spstarr: done building, now rpms being spit out
spstarr: done!
airlied: okay got the clipping bug fixed.
airlied: my X isn't even registering an irq here.
spstarr: ?
spstarr: rebooting now with no kms kernel
spstarr: we
spstarr: weeeeeeeeeeee
spstarr: here goes....glxgears...
spstarr: PASS
spstarr: although the glxgears look corrupt
airlied: spstarr: I have a problem locally alright.
spstarr: it didnt stall out...
spstarr: airlied :-)
airlied: with the irq going wrong..
spstarr: airlied: irq?
airlied: not hanging.
spstarr: thats not good
airlied: X is just not getting an irq for the drm at startup
spstarr: 7885 frames in 5.0 seconds = 1575.250 FPS
spstarr: 8
spstarr: interesting
spstarr: i wonder if thats related or the result of what im seeing
spstarr: which is why /proc/interrupts shows no radeon
airlied: probably is.
spstarr: lemme see if it does now
spstarr: BINGO
spstarr: 9: 68975 XT-PIC-XT acpi, uhci_hcd:usb2, eth0, radeon@pci:0000:01:00.0
airlied: it says it in the Xorg log
spstarr: airlied: doh
airlied: quite explicitly
spstarr: (II) RADEON(0): [drm] failure adding irq handler, there is a device already using that irq
spstarr: [drm] falling back to irq-free operation
spstarr: bah!
spstarr: goes to show i didnt even take notice
spstarr: that was .old
spstarr: closes kernel.org bug
spstarr: no point in them worrying about it
spstarr: the kernel did the right thing
mentor: peers expectantly at airlied
spstarr: with that fixed, i can resume the glx debugging and start debugging the server DRI
terracon: w00t . go spstarr
spstarr: terracon: im determined to find the race
spstarr: no matter how many ErrorF()'s I have to put
terracon: make it so!
spstarr: its a slow process, i just need to know if im on the right path
airlied: you can resume glx debugging on the kernel you are using now :)
spstarr: very true..
spstarr: looks at spstarr_desk
spstarr: ok, adding debug in server DRI bits now for drawables / contexts
spstarr: then i'll add them to the DRI driver in mesa and rebuild that
spstarr: ProcXF86DRICreateContext
spstarr: im not sure if I need to look at the Proc* routines
spstarr: just dri.c I think
spstarr: oh
spstarr: ?
spstarr: drmCreateContext()
spstarr: but thats coming from libdrm if im not mistaken
spstarr: a drm_context_t
spstarr: thats just an unsigned int
spstarr: drm_ctx is the struct
airlied: you can ignore drm contexts
spstarr: ok
spstarr: so i only should be looking at the server side DRI contexts created?
airlied: server/mesa interactions via the AIGLX loader.
spstarr: ok, so then now i need to look at the mesa DRI driver i guess only
spstarr: since ive got the aiglx bits spewing debug
spstarr: airlied: who calls the callbacks though?
spstarr: I see its hooked into the loader
spstarr: or is this done *once*
spstarr: it doesnt seem to be
airlied: spstarr: which callbacks?
spstarr: static const __DRIgetDrawableInfoExtension getDrawableInfoExtension = {
spstarr: { __DRI_GET_DRAWABLE_INFO, __DRI_GET_DRAWABLE_INFO_VERSION },
spstarr: getDrawableInfo
spstarr: };
spstarr: this is being loaded into something
spstarr: loader extensions
spstarr: function pointer for a createNewScreen routine passes a list of function pointers of the loadable extensions
airlied: and the driver deals with them.
spstarr: and by driver you mean the dri driver itself?
airlied: code in dri_util.c in mesa I think
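
The hand-off spstarr describes, where the loader passes a list of extension structs and the driver picks out the ones it understands, can be sketched roughly as below. All type and function names here are hypothetical stand-ins, not the actual DRI interface; the only thing taken from the paste above is the shape of a name/version header followed by function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Common header every extension starts with (hypothetical). */
struct extension {
    const char *name;
    int version;
};

/* One concrete extension: the header plus its callbacks. */
struct get_drawable_info_ext {
    struct extension base;
    int (*get_drawable_info)(void *drawable);
};

/* Placeholder callback so the lookup can be exercised. */
int stub_get_drawable_info(void *drawable)
{
    (void)drawable;
    return 42;
}

/* Walk a NULL-terminated extension list, as the driver side would do
 * once when the screen is created, and return the first entry with a
 * matching name and a high enough version. */
const struct extension *
find_extension(const struct extension *const *list,
               const char *name, int min_version)
{
    assert(name != NULL);
    if (list == NULL)
        return NULL;
    for (size_t i = 0; list[i] != NULL; i++) {
        if (strcmp(list[i]->name, name) == 0 &&
            list[i]->version >= min_version)
            return list[i];
    }
    return NULL;
}
```

In this model the driver stores the found pointer and calls through it later, so the lookup happens once while the callback itself can be invoked many times.
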
spstarr: ok so thats the connection to mesa side
spstarr: so im done for X server onto mesa
spstarr: yep dri_util.c
spstarr: "This module acts as glue between GLX and hw driver"
spstarr: oh there's a reference count for contexts
spstarr: its interesting
spstarr: /* XXX this is disabled so that if we call SwapBuffers...
spstarr: there's actual code to set the context to NULL
spstarr: er the drawable
spstarr: but its disabled
spstarr: airlied: what does mesa use for debug? fprintf?
spstarr: or it has its own ability to write to X log
spstarr: looks to be so
spstarr: I should turn on RADEON_DEBUG DEBUG_DRI
spstarr: hm
spstarr: we do assert(r300)
airlied: spstarr: fprintf.. it doesn't write to the logs.
spstarr: ok
spstarr: i see there's assert checks in r300DestroyContext, and if this is the case i dont know why X would crash rather than seeing a assert
airlied: spstarr: you seem to misunderstand pointer validity.
spstarr: well if a pointer is set to NULL it's certainly not freed
airlied: spstarr: okay I'll explain it again, cut-n-paste this :)
airlied: spstarr: A gives B a pointer, B stores the pointer somewhere, A frees the pointer and sets its copy to NULL, it has no idea where B has stored the pointer.
airlied: B's pointer is !NULL, it's still the original value, but is now pointing at freed memory.
airlied: you cannot validate B's pointer.
spstarr: its dangling
airlied: yup, the next time something in the r300 driver references it, it explodes.
airlied: you can't assert though as there is no way to know the pointer is gone unless A tells B.
airlied: you need to find out why A doesn't tell B or why B doesn't deal with it.
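
airlied's A/B explanation can be sketched in a few lines of C. The names below are hypothetical: `owner` plays A (the layer that allocates and frees the drawable) and `driver` plays B (the code that cached a copy of the pointer). The sketch shows the "A tells B" fix, a destroy notification delivered before the memory is freed; it is an illustration of the general pattern, not the actual GLX or r300 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct drawable {
    int id;
};

/* B: caches a pointer it was handed and may use it later. */
struct driver {
    struct drawable *cached;
};

/* A: owns the drawable and optionally notifies a listener on destroy. */
struct owner {
    struct drawable *draw;
    void (*on_destroy)(void *listener, struct drawable *d);
    void *listener;
};

/* B's reaction to the notification: drop the cached copy. */
void driver_forget(void *listener, struct drawable *d)
{
    struct driver *drv = listener;
    assert(drv != NULL);
    if (drv->cached == d)
        drv->cached = NULL;
}

/* A frees the drawable. Without the on_destroy call, B's copy would
 * keep its old non-NULL value and no NULL check or assert in B could
 * ever detect that the memory behind it is gone. */
void owner_destroy(struct owner *o)
{
    assert(o != NULL);
    if (o->draw == NULL)
        return;
    if (o->on_destroy != NULL)
        o->on_destroy(o->listener, o->draw);  /* tell B first */
    free(o->draw);
    o->draw = NULL;  /* only clears A's own copy */
}
```

The notification has to run before `free`, because once the memory is released B can no longer safely compare or inspect the old pointer.
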