g-zu: for anyone interested - I did some (a lot of) 2D benchmarks with both radeonhd and radeon drivers on rs690 with different X setups... they are available via ftp://g-zu.no-ip.org/x11perf.tar.bz2 - use x11perfcomp to compare the results
g-zu: if someone could host these results at a more permanent address I'd be grateful
Ryszu: Do you have a dri result for the radeon driver?
g-zu: yea, dri-exa and dri-xaa... though dri-xaa froze my X
g-zu: download them and you'll see... also read the README... you'll find a lot of details
Ryszu: Yeah, I've taken a look
g-zu: erm... unpack them with complete path
Ryszu: I've done that, and run x11perfcomp to generate the comparison output
g-zu: they were put into directories: ati for radeon and radeonhd for radeonhd
g-zu: and use x11perfcomp -r
g-zu: gives relative performance in percent
g-zu: it's more readable
g-zu: actually not percentage... but it's relative
g-zu: or -ro gives only relative performance making it even more readable, and you can also compare more than 2 files at once
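A minimal sketch of what a relative comparison amounts to, assuming each result file reduces to one rate (operations per second) per test and that ratios are taken against the first file listed. The function name and the sample numbers are hypothetical and are not taken from the published results.

```python
def relative_to_first(rates):
    """Convert absolute rates into ratios against the first result file.

    rates: dict mapping a test label to a list of rates, one per result file.
    Returns a dict with the same keys and a list of ratios per test.
    """
    out = {}
    for test, values in rates.items():
        base = values[0]
        # A zero baseline cannot be used as a divisor; mark it as not-a-number.
        out[test] = [v / base if base else float("nan") for v in values]
    return out


# Hypothetical numbers, for illustration only.
sample = {
    "500x500 rectangle": [200.0, 400.0, 100.0],
    "Dot": [1000.0, 500.0, 1500.0],
}

for test, ratios in relative_to_first(sample).items():
    print(f"{test:20s}", " ".join(f"{r:6.2f}" for r in ratios))
```

A ratio above 1 means that setup was faster than the first one on that test, which is why the relative view is easier to scan than raw rates.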
Ryszu: http://www.beyond3d.com/private/oss/g-zu/results.txt there's all 9
udovdh: Ryszu, maybe graph these numbers?
Ryszu: Yeah, working on that now
g-zu: Ryszu I made a html table out of them in the meantime :)
Ryszu: I think there's too many series to graph in any kind of sensible way
Ryszu: Ordered using average relative perf: radeonhd/dri, radeonhd/dri-xaa, radeonhd/xaa, ati/xaa, radeonhd/dri-exa, ati/dri-exa, radeonhd/exa, ati/exa
Ryszu: Fastest to slowest
Ryszu: ati/dri-xaa not there because it couldn't complete all the tests
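The ordering described above can be sketched as follows, assuming the "average" is a plain arithmetic mean of the per-test relative rates (the log does not say which mean was used) and that a setup missing any test is left out, as ati/dri-xaa was. All names and numbers in the sample are hypothetical.

```python
from statistics import mean


def rank_by_average(relative, all_tests):
    """Rank setups by mean relative rate, fastest first.

    relative: dict mapping setup name -> dict of test label -> relative rate.
    all_tests: the full list of test labels every ranked setup must have.
    Returns (ranking, skipped), where skipped lists incomplete setups.
    """
    complete = {
        name: results
        for name, results in relative.items()
        if all(test in results for test in all_tests)
    }
    skipped = sorted(set(relative) - set(complete))
    ranking = sorted(
        complete,
        key=lambda name: mean(complete[name][t] for t in all_tests),
        reverse=True,
    )
    return ranking, skipped


# Hypothetical data, for illustration only.
tests = ["rect10", "rect500"]
data = {
    "setup-a": {"rect10": 1.0, "rect500": 1.0},
    "setup-b": {"rect10": 2.0, "rect500": 1.5},
    "setup-c": {"rect10": 3.0},  # did not finish every test
}
print(rank_by_average(data, tests))
```

An arithmetic mean lets a few tests with very large ratios dominate the ordering; a geometric mean would be a reasonable alternative if that turns out to be a problem.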
g-zu: Ryszu could you also group them somehow by test-size... there are many tests using 1x1 10x10 100x100 and 500x500
agd5f: Ryszu, g-zu: if you were using my rhd radeon port branch, the only accelmethod implemented is exa
g-zu: and results vary a lot with size
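One way to do the grouping by test size, as a sketch: it assumes each test label contains its size written as NxN and only recognises the four sizes mentioned above, so that other dimensions in a label (a font size, for example) are not mistaken for a test size. The labels in the sample are illustrative.

```python
import re
from collections import defaultdict

# Only the square sizes mentioned in the discussion: 1x1, 10x10, 100x100, 500x500.
SIZE_RE = re.compile(r"\b(1|10|100|500)x\1\b")


def group_by_size(labels):
    """Group test labels by the NxN size found in them; the rest go to 'other'."""
    groups = defaultdict(list)
    for label in labels:
        match = SIZE_RE.search(label)
        groups[match.group(0) if match else "other"].append(label)
    return dict(groups)


# Illustrative labels only.
labels = [
    "500x500 rectangle",
    "Copy 100x100 from window to window",
    "Char in 80-char line (6x13)",
    "Dot",
]
for size, members in group_by_size(labels).items():
    print(size, members)
```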
g-zu: oh, so that's why xaa almost didn't change at all :)
udovdh: bar graphs per test?
agd5f: everything else falls back to shadowfb
udovdh: makes many graphs
udovdh: how to make a summary out of it?
udovdh: only shadowfb for me I guess (rv630)
g-zu: agd5f any explanation why the big copy tests (100x100 and 500x500) behaved so differently with the dri-xaa combination?
g-zu: it's really hard to summarize, that's why I gave the full results... also took me about a day to run all those tests
g-zu: with interruptions of course
agd5f: g-zu: with shadowfb it's all done in system memory
g-zu: udovdh I will also make a pure shadowfb test this night... so you
udovdh: can compare?
g-zu: udovdh yes
udovdh: with some graphs the results could be made into an article by some website
udovdh: I guess
g-zu: so a bit of clarification... dri is actually dri-none and dri-xaa should have been dri-shadowfb for radeonhd
g-zu: udovdh it would make tens of pages of graphs
udovdh: yes, raw numbers
udovdh: and a few graphs of interesting tests
udovdh: to show differences
udovdh: between drivers
udovdh: and acceleration methods
udovdh: some sort of 'line' must be visible in this heap of numbers?
g-zu: I updated the tar archive and results.html from ftp://g-zu.no-ip.org to reflect the things agd5f told me
g-zu: I will be busy for a couple of hours, but after that I will probably write a php script to dynamically draw some graphs from all those numbers depending on some selections users make
bridgman: g-zu: for radeonhd, did you use the master or agd5f's "quick and dirty" branch ?
Ryszu: The latter
g-zu: bridgman as it says in the readme xaa, exa and dri-none tests are done using the master, dri-shadowfb and dri-exa from agd5f's branch
bridgman: sorry, I went straight to Ryszu's page and missed the readme ;)
Ryszu: Readme there
bridgman: you're not gonna let me miss that are you ;)
libvde: Ryszu: after a quick glance through the irc backlog: radeonhd xaa rocks?
Ryszu: As long as bigger numbers are better ;)
libvde: ah... so no?
Ryszu: Hehe, yeah, I think that's the case
Ryszu: radeonhd/xaa seems quickest
g-zu: take a look at http://g-zu.no-ip.org/
bridgman: and higher numbers are definitely better ?
g-zu: I put the correct readme and a results.html file there... along with all tests so far
g-zu: bridgman yes, higher numbers are better in all tests
g-zu: the time was the same, 4 seconds each
bridgman: the disturbing thing is that shadowfb and dri-none both kind of rock...
Ryszu: http://www.beyond3d.com/private/oss/g-zu/results.html formatted a little better (hopefully)
udovdh: nice list
g-zu: Ryszu it's definitely more readable
udovdh: dri needs improving
udovdh: shadowfb is damn well programmed?
libvde: udovdh: radeonhd-dri is no acceleration for 2d ops atm
udovdh: that explains
udovdh: maybe such facts would be welcome in a technical readme?
udovdh: redo the test in a month or 3 and compare...
g-zu: for most people 2d being more about text I'd say the char tests are the most relevant... and there dri-exa definitely rocks
libvde: udovdh: all it takes is hooking up drm cp
libvde: "all", as cache flushing and engine idling, start, restart, etc all need to be handled
udovdh: yes, to the user it all looks simple
udovdh: I hope you can pull it off!
bridgman: libvde: I think this was done off agd5f's quick and dirty branch so drm cp
bridgman: should already be hooked up
bridgman: for exa at least
bridgman: still not sure why dri-none is so fast ;)
g-zu: I think I should move the README into the html to avoid this confusion
bridgman: that would sure help me ;)
bridgman: we should do the same for Ryszu's page if possible
bridgman: cool, thanks;
bridgman: any ideas on why dri-none is so fast ?
bridgman: should be slower than shadowfb, shouldn't it ?
rehabdoll: is there an estimate when xvideo will work for r600 ?
bridgman: maybe "when I get the damn docs out the door plus 2 weeks" ;)
rehabdoll: oh, i thought that was included in the recently released docs :>
bridgman: that's "first video runs for a while without crashing", not "polished" though..
bridgman: the hardest stuff to understand was released (at least most of it) but
bridgman: there's still some more needed to get things running
bridgman: for 5xx the basic programming was close enough to 3xx/4xx that we didn't need
rehabdoll: oh ok, thanks
bridgman: sample code but for 6xx I think some sample code is going to help; we don't
bridgman: have anything to act as a sample for the userspace code yet; tcore really mostly
bridgman: covers the drm stuff
lupine_85: yay, drm
bridgman: good drm, not evil drm ;)
lupine_85: absolutely :)
lupine_85: still no ETA on that, then?
udovdh: `when it's ready` (tm)
lupine_85: that's not a time :p
udovdh: before christmas
bridgman: weeks not months
lupine_85: which year?
lupine_85: thanks bridgman
bridgman: was hoping to get more done yesterday but ran out of coffee
rehabdoll: teh horror
bridgman: yeah, it was like I had been drugged or something
lupine_85: do we need to set you up with a brib^Wcoffee fund?
bridgman: nah, I just need to take the empty coffee can out of my cupboard so
bridgman: it doesn't look like we have lots of coffee
lupine_85: ah well, I'll leave you to do that and I'll go do some dancing :)
bridgman: ok, off to cut more grass; mad dogs & englishmen, you know
bridgman: someone please try to explain why dri-none was so fast ;)
agd5f: dri-none on my branch is probably shadowfb, but I haven't really looked at the logic. all I've tested so far is exa
bridgman: agd5f: that's the mystery. According to readme the dri-none test was done on master
bridgman: which should really fall back to none, ie no shadowfb
bridgman: I could understand if dri-none was run on your branch
g-zu: it wasn't
g-zu: and there's significant difference between dri-none and dri-shadowfb
g-zu: you can compare them yourself to see
g-zu: window operations are especially slow with pure dri
d2kx: can anyone help me getting the quick_and_dirty_accel branch of agd5f's private radeonhd git directory?
d2kx: radeon+EXA is so damn fast that I want to try it out with radeonhd, too ^^
d2kx: i did some radeon+XAA/EXA vs. radeonhd+XAA/EXA vs. fglrx gtkperf benchmarks in case anyone's interested: http://global.phoronix-test-suite.com/index.php?k=profile&u=d2kx-10046-8436-23452
bridgman: libv would probably want it pointed out that the EXA code in radeonhd already would
bridgman: probably be just as fast once drm cp paths were added, which is correct
bridgman: what we're also picking up from the fresh port is EXA render, composite, textured
bridgman: video and all the infrastructure that supports it
g-zu: d2kx run git clone git://people.freedesktop.org/~agd5f/xf86-video-radeonhd rhd-accel; cd rhd-accel; git checkout origin/quick_and_dirty_accel
d2kx: thx g-zu, will try this now
g-zu: libv are you around?
bridgman: it's Miller Time in Europe ;)
g-zu: well, anyone experienced then?
libvde: g-zu: yeah, i am now, what's up?
libvde: miller time... i wish :)
libvde: i cycled 180kms this weekend, of which 50 trying to beat 2 others taking the train (the slow paths here in nurnberg killed us)
libvde: miller time will be soon though :)
so1: omg ... from 2:0 to 3:2 ...
so1: did someone watch it?
MostAwesomeDude: so1: ?
g-zu: and hello again... haven't noticed before but just ran a dmesg and saw tons of the following message
g-zu: [drm] wait for fifo failed status : 0x9803C100 0x00080100
g-zu: along with a couple of [drm] wait idle failed status : 0x9803C100 0x00080100
glisse: g-zu: did it lockup ?
g-zu: that message means the operation timed-out
glisse: g-zu: what card ?
g-zu: as far as I can tell by looking into the code
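A sketch of the kind of polling loop that would produce a "wait for fifo failed" timeout, written as plain Python rather than driver code. Everything here is hypothetical: `read_status` stands in for a register read, `free_slots` for whatever decodes the free-slot count from the status word, and the poll budget is arbitrary.

```python
def wait_for_fifo(read_status, free_slots, needed, max_polls):
    """Poll a status word until enough FIFO slots are free or the budget runs out.

    read_status: callable returning the current status word (hypothetical).
    free_slots:  callable decoding the free-slot count from a status word.
    needed:      number of free slots required before proceeding.
    max_polls:   how many reads to attempt before giving up.
    Returns (ok, last_status); ok is False on timeout.
    """
    last = None
    for _ in range(max_polls):
        last = read_status()
        if free_slots(last) >= needed:
            return True, last
    return False, last


# Fake hardware for illustration: reports no free slots for three reads, then 8.
readings = iter([0, 0, 0, 8])
ok, last = wait_for_fifo(
    read_status=lambda: next(readings),
    free_slots=lambda status: status & 0x7F,  # hypothetical field layout
    needed=4,
    max_polls=3,
)
if not ok:
    print(f"wait for fifo failed status : 0x{last:08X}")
```

With a loop like this, a timeout only says the condition was not met within the poll budget. It does not by itself show that the engine is locked up.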
glisse: well, we are bad at detecting idle and that kind of thing
glisse: i should disable this message, it can be misleading
glisse: though your engine reports idle
glisse: g-zu: is the system responsive even when this message appears in the log ?
glisse: s/idle/not idle
g-zu: well, the fifo failed appears when the system is under >90% busy-wait
g-zu: and the idle failed on other occasions
g-zu: I can actually reproduce the wait for fifo failed at any given time
glisse: g-zu: so x is not responsive right ?
g-zu: negative, X is responsive
g-zu: this is confusing
g-zu: so when I press ctrl+c the application stops, from X
g-zu: so it responds
g-zu: it's not very quick, but I don't think it's supposed to be
glisse: g-zu: so you can kill the application
glisse: and get back a functional x right ?
g-zu: glisse yes, I get back a fully functional X
g-zu: the thing is as far as I can tell that idle slot calculation is incorrect for rs690
g-zu: the documentation states it's bits 19:18 of the RRBM_STATUS
g-zu: and those are all set... so the max number of slots are available
glisse: g-zu: which doc ?
g-zu: page 356
glisse: this isn't the same reg
g-zu: well, I can't find that reg in the docs
glisse: RBBM_STATUS is at 0x1740
g-zu: which doc?
glisse: i don't think it's in any doc
glisse: well no i can't find it in doc
glisse: however the header has all the bit descriptions for it
g-zu: glisse well in that case the mask is wrong
g-zu: oh sorry
g-zu: I forgot it's hex
g-zu: mask is ok
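Checking a mask against a bit range is easy to get wrong when switching between hex and decimal, so here is a small sketch of the arithmetic. The 19:18 range is only the one quoted from the documentation above, and as noted it may belong to a different register than the one the driver reads; the helper names are hypothetical.

```python
def field_mask(hi, lo):
    """Mask covering bits hi..lo (inclusive) of a register value."""
    width = hi - lo + 1
    return ((1 << width) - 1) << lo


def extract_bits(value, hi, lo):
    """Return the field stored in bits hi..lo of value, shifted down to bit 0."""
    return (value & field_mask(hi, lo)) >> lo


mask = field_mask(19, 18)
# The same mask looks very different in the two bases.
print(f"hex: 0x{mask:08X}  decimal: {mask}")

# Status word quoted from the kernel log above.
status = 0x9803C100
print("bits 19:18 =", extract_bits(status, 19, 18))
```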
glisse: g-zu: can you run another 3d app after killing this one
glisse: and does this 3d work ?
g-zu: all works fine
g-zu: even that application runs normally
g-zu: I just get an interminable kernel log
glisse: very strange
glisse: when the engine reports busy as you wait for it you should not be able to kill the beast
g-zu: probably at some point it's no longer busy and I can :)
g-zu: you think I should try increasing the timeout?
g-zu: or maybe try reading from the other register I found documented instead?
g-zu: after rereading the meaning of that register I found in the docs I don't think it's automatically decremented... I think it can only be altered by a write
g-zu: glisse how about if I increase the timeout
glisse: g-zu: what you mean by automatically decremented ?
glisse: RBBM_STATUS is read only
glisse: the number of fifo is reported by the gpu
g-zu: yeah, but the other one... I found in the docs... with WATERMARK in the name isn't
glisse: as all other state
glisse: you should not touch this one
g-zu: and it appears that right after the time-out it has 8 free slots
g-zu: for some reason
glisse: g-zu: if you relaunch you will likely see a different value
g-zu: it's always the same
glisse: my guess is that we need to write some register like ztop to assert idle
glisse: g-zu: reboot your computer or quit X and relaunch a 3d app, i would not expect you to see the same value
g-zu: I already rebooted
g-zu: 3 times
g-zu: it's the same
g-zu: I restarted X, tried other apps
glisse: it's all a matter of how much the gpu can fill the fifo when reaching a wait_until
g-zu: always the same values
glisse: okay we are not talking about the same bit
glisse: i was talking about the number of free fifo slots
glisse: so it says that GUI is busy and 2d is active
glisse: so for some reason 2d is stuck
glisse: which is extremely rare in my experience
g-zu: glisse firefox manages to create that event every now and then, depends on the page... but I can always successfully create it just by running x11perf -rect500
glisse: what does this test do ?
glisse: fill a rect of 500x500 with a color ?
g-zu: creates a lot of 500x500 rectangles
g-zu: they're just black
glisse: with color or texture ?
g-zu: it's not a fill test
g-zu: fill tests have fill in their name
glisse: g-zu: can you relaunch the same test after killing it ?
g-zu: actually any number of times
glisse: i guess -rect100 doesn't trigger this ?
g-zu: X is incredibly stable lately
g-zu: -rect10 doesn't
g-zu: rect100 does but not so often
glisse: so it's just us getting impatient with the 2d engine
g-zu: I guess
glisse: we really need to find a better way to wait for idle
glisse: but i suspect X is sending 2d cmds behind our back while we wait for idle
glisse: which is not helping
g-zu: actually not really... processor is 97% in busy_wait with that specific test
glisse: can you move anything while at 97% ?
g-zu: let me try
g-zu: mouse barely replies
g-zu: like 1 out of 5 clicks
g-zu: keyboard works somehow
glisse: well anyway i got to get some sleep
g-zu: glisse ok cu around
g-zu: good night