tinylights: I got an ATI 9600 radeon, the screen won't refresh until I resize windows! - http://rafb.net/p/uGaAN425.html
spstarr: mobile?
Revellion: morning #radeon
osiris_: glisse: your kms radeon driver works only with !avivo cards, right?
glisse: osiris_: right
glisse: and only vga output
osiris_: glisse: so what's the plan for avivo chips? separate driver?
glisse: the plan is to port radeon ddx to kernel
osiris_: without splitting avivo and non-avivo?
glisse: the code will be split
glisse: a little bit more than it is now in the ddx
osiris_: glisse: has anyone started working on xf86-video-kms driver?
rx__: isn't that just the fbdev hack glisse made?
glisse: rx__: no xf86-video-kms is not the previous hack of fbdev
glisse: osiris_: i will add basic exa to kms later today
rx__: searches around
rx__: oh interesting..
rx__: goes read the irc log
glisse: osiris_: if you want to add exa from radeon ddx to kms you're more than welcome :)
osiris_: glisse: unfortunately I have only rs690 card so I wouldn't be able to check what I've written
osiris_: glisse: if you could add some dirty hack to make my rs690 work with radeon kms, I would be more than willing to do exa
glisse: osiris_: i or someone else will redo the modesetting part soon
arekm: "kms radeon driver"? what's that? :)
mikkoc: is 3d supposed to work with radeon, drm and mesa from git? on ati mobility X1400
mikkoc: $ glxinfo | grep dir
mikkoc: direct rendering: No
osiris_: arekm: kernel modesetting driver
osiris_: glisse: what's your vision for implementing exa through kms drivers?
glisse: osiris_: mostly copy code
glisse: and use ttm
osiris_: glisse: but if we want to have one xorg kms driver (independent of kms backends) we would need some new interface, right?
glisse: osiris_: well we can separate the exa from the modesetting cleanly
glisse: without help of xserver
osiris_: glisse: I can't see how it could be done.
osiris_: xserver will use two video drivers (one for modesetting only, second for accel related stuff)?
OipOS: Hey. It seems that after upgrading mesa/xserver/xserver-video-ati to git, my hardware acceleration is acting weird. Here's the output of glxinfo(doesn't run normally): http://pastebin.com/d271c7bbd
Revellion: leio: morning
leio: evening
Revellion: :), also following #15371? :)
leio: err, yes
chrisdi: Good morning, evening, etc. to all of you. :-)
OipOS: Yo.
chrisdi: I'm having trouble getting my Radeon R9250 completely up and running with X.org 7.3 and the open source radeon driver. According to glxinfo, 3d acceleration is up and running. But when I try to play ppracer (Planet Penguin Racer) aka TuxRacer I only get a black screen and my monitor starts searching for a signal on the DVI and VGA inputs ...
chrisdi: I already tried setting several options in xorg.conf, but none of them helped ...
chrisdi: The other problem is that the graphic output of KDE (3.5.8) gets corrupted when I try to use the pivot function. Almost 2/3 of the screen is rendered correctly, but the rest is only displayed as garbage ...
chrisdi: Does anyone have some tips about these issues?
mikkoc: can anyone help me with 3d on r500 please?
otaylor: Revellion: around? Have some time to continue debugging?
agd5f: otaylor: I can help. BTW, thanks for digging into this.
otaylor: agd5f: "enlightened self interest" ... you have a r200 available?
agd5f: sure
otaylor: agd5f: makes sense :-) So, the not-quite-working thing I was investigating with Revellion yesterday was git master + the patch from 15334 and the two patches from 15371
otaylor: agd5f: I should publish a tree somewhere, but still just a bit shaky with git :-)
agd5f: otaylor: let me swap in the r200
mikkoc: is the drm kernel module needed to make 3d work with r500?
otaylor: mikkoc: You aren't going to get 3d without a drm kernel module certainly
agd5f: mikkoc: yes. and you need airlied's r500 mesa tree
mikkoc: uhm, so mesa from git isn't enough?
agd5f: mikkoc: nope
mikkoc: ahh ok
mikkoc: thx
mikkoc: this looks pretty old: http://cgit.freedesktop.org/~airlied/mesa/
mikkoc: where do i find the right one?
eboettcher: agd5f: I'm working on my gsoc proposal and I'm wondering if you're still the mentor for r300 DRI stuff as the wiki says, and if I should apply to xorg or google. (again the wiki says xorg, but I'm just checking here since that page has not changed in a while)
agd5f: mikkoc: r500test
agd5f: eboettcher: submit your proposal via the xorg gsoc
eboettcher: over the course of today I'll bounce ideas on dri-devel
agd5f: otaylor: building the patched driver now
agd5f: otaylor: and, I'm up
otaylor: OK, so first question I have is whether http://fishsoup.net/tmp/simple-repeat.py has buggy output
bridgman: agd5f: beat me to it... http://cgit.freedesktop.org/~airlied/mesa/log/?h=r500test
otaylor: It should look like http://fishsoup.net/tmp/simple-repeat.png
agd5f: looks good
otaylor: agd5f: Hmm :-(
otaylor: agd5f: OK, so let's see if you see the problem that Revellion was seeing with http://fishsoup.net/tmp/repeat.py
otaylor: Oh, I know what's wrong with simple-repeat.py. Let me do a new version
otaylor: agd5f: OK, new python and reference images up (same urls)
agd5f: otaylor: also looks good
otaylor: agd5f: OK, so let's look at http://fishsoup.net/tmp/repeats.py
agd5f: otaylor: actually maybe slightly different, but barely noticeable
otaylor: agd5f: slightly different might be interesting. If it's in the repeating section, and not the antialiasing around the edges (a few bits of difference in the antialiasing isn't significant though I don't expect it)
agd5f: otaylor: looks like the edges. you want a screenshot?
otaylor: That would be great
otaylor: there's a reference image for repeats.py at http://fishsoup.net/tmp/repeats.png ... that's actually what I was looking at with Revellion yesterday on the r200, I was just hoping I could boil it down to something simpler
mikkoc: how do i download the r500test branch and make it work? :D
mikkoc: i already have mesa and drm from git
Revellion: good evening folks :D
agd5f: otaylor: http://www.botchco.com/alex/otaylor/agd-simple-repeat.png
spstarr: hmm otaylor your patches worked fine, chance agd5f etc could merge them into master?
agd5f: spstarr: we're still working on them
spstarr: ok
otaylor: agd5f: Yeah, that looks fine (the reference image is a little scaled down there by nautilus, which is the difference in your screenshot)
agd5f: ah ok
agd5f: otaylor: yeah, unscaled they look identical
otaylor: Revellion: Actually, if you are around, can you try http://fishsoup.net/tmp/simple-repeats.py (reference image .png) on your setup, to see if there's a difference (Xserver, pixman, hw, whatever) between your setup and agd5f's?
otaylor: Revellion: sorry, url is http://fishsoup.net/tmp/simple-repeat.py
glisse: osiris_: on load ddx is given the pciid, so we can choose accel code we register into Xorg according to this id
Revellion: otaylor: sure
Revellion: takes awhile to resolv that url for some reason
Revellion: otaylor: screenshot of the window contents of that rendering?
osiris_: glisse: so xserver kms driver will have accel code for all chips?
otaylor: Revellion: Yeah, though if it looks just like http://fishsoup.net/tmp/simple-repeat.png, then no need
agd5f: osiris_: kms won't have accel code at all
Revellion: the Software versions here are (for reference): Xserver 1.4.0.90, Pixman 0.9.6 hw: Radeon 9200SE(128MB PCI) rv280 chip
glisse: osiris_: don't know yet, having one ddx for all kms is just a possible solution
glisse: agd5f: osiris is talking about a ddx which use kernel modesetting
agd5f: ah
Revellion: otaylor: it's a 1:1 match
Revellion: both entirely similar
osiris_: glisse: maybe it will be better to implement exa accel on top of gallium?
otaylor: Revellion: OK, then something is different about the real test
glisse: osiris_: right now we need basic exa accel in ddx for dri2 to work
osiris_: glisse: ok
Revellion: putting em both over each other using subtraction as compositing mode makes just a black picture with no patterns of difference :|
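The subtraction check Revellion describes can be sketched in a few lines of Python. This is only an illustration: the function names and the list-of-tuples image layout are assumptions made here, not anything taken from the test scripts. Two renderings match exactly when every per-channel difference is zero, which is what an all-black subtraction result shows.

```python
def subtract_images(a, b):
    """Per-channel absolute difference of two equally sized images.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    (Hypothetical layout chosen for this sketch.)
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [
        [tuple(abs(ca - cb) for ca, cb in zip(pa, pb)) for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def differing_pixels(a, b):
    """Return (x, y) coordinates where the two images differ at all."""
    diff = subtract_images(a, b)
    return [
        (x, y)
        for y, row in enumerate(diff)
        for x, pixel in enumerate(row)
        if any(pixel)  # non-black in the subtraction result
    ]
```

An empty list from `differing_pixels` corresponds to the "just a black picture" result above; any coordinates returned point at the pixels worth a closer look.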
otaylor: Revellion: OK, what about http://fishsoup.net/tmp/repeats-4-8.py and http://fishsoup.net/tmp/repeats-4-8.png
osiris_: agd5f: repeating my yesterday question: "is there a reason why anisotropic filtering bits are missing in r500 accel docs?"
otaylor: (this is just the same program as yesterday, with preset values to narrow it to one image, to avoid me having to give you a series of keystrokes to hit)
glisse: osiris_: i think they are in the doc under AA_
Revellion: otaylor: differs
glisse: or are you talking about texture filtering ?
otaylor: agd5f: Does that set differ for you as well?
Revellion: doesn't match the original rendering you linked in png
otaylor: Revellion: With random junk?
Revellion: otaylor: nah some colors are wrong and there are black lines above some of the gradients
Revellion: ill just put it up for your eyes
otaylor: Revellion: Well, that sounds like random junk to me :-)
Revellion: rofl
agd5f: otaylor: looks identical to me
Revellion: http://suzuka.anirev.net/~revellion/repeats-4-8-local.png
Revellion: my rendering
glisse: oh anisotropic is for textures, i confused it with antialiasing, too many words in the filtering area :)
agd5f: osiris_: under review
Revellion: otaylor: also it differs when redrawn as well
otaylor: agd5f: and if you hit r a few times, it stays looking good and unchanged? This implies that the bugs that Revellion is seeing (which should be in the software path) might be fixed in current X server or pixman
Revellion: agd5f: have you let it redraw?
agd5f: otaylor: yeah still looks good
otaylor: (Does server 1.4 use pixman for rendering? I don't recall)
Revellion: otaylor: it links against it here it appears
Revellion: libpixman-1.so.0 => /usr/lib/libpixman-1.so.0 (0xb7e4c000)
agd5f: I'm using git everything from a few weeks ago
otaylor: Revellion: You feel up to building a newer X server? I don't really want to spend time doing detailed debugging against 1.4, but I'd like to know if what you are seeing is actually Xserver version specific
Revellion: otaylor: no worries, got spare time anyways atm :)
otaylor: Revellion: the easier upgrade is probably pixman, worth doing first I guess
osiris_: agd5f: at the beginning of chapter 9.4.1 there is a reference to GB_TILE_SELECT, I think it should be GB_PIPE_SELECT
otaylor: agd5f: OK, so ignoring the bug that you are not seeing, I have one question before I try to do a new rev of the R200/R100 changes
agd5f: ok
otaylor: agd5f: It's about the check:
otaylor: if (pPict->repeat) {
otaylor: if ((h != 1) &&
otaylor: (((w * pPix->drawable.bitsPerPixel / 8 + 31) & ~31) != txpitch))
otaylor: RADEON_FALLBACK(("Width %d and pitch %u not compatible for repeat\n",
otaylor: w, (unsigned)txpitch));
otaylor: In R200TextureSetup
otaylor: that turned out to be unnecessary for R300, so I'm interested in an experimental test of whether it *is* necessary for R200, before I write code to work around it in software
agd5f: otaylor: ok
otaylor: Unless you are sure that it's necessary of course :-)
otaylor: So basically just #if 0 out the if ((h != 1) && ...test
agd5f: yup
otaylor: we're already setting TXPITCH unconditionally so if the hw can use TXPITCH to do address computations when repeating, removing the test/fallback should be the only thing necessary
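The fallback check pasted above can be restated as a small Python sketch. It only illustrates the arithmetic in the quoted snippet: the function names are made up here, and the 32-byte rounding is read off the `+ 31) & ~31` expression, not taken from hardware documentation.

```python
def aligned_row_bytes(width_px, bits_per_pixel, align=32):
    """Bytes per row, rounded up to a multiple of `align` (a power of two)."""
    row_bytes = width_px * bits_per_pixel // 8
    return (row_bytes + align - 1) & ~(align - 1)


def repeat_needs_fallback(width_px, height_px, bits_per_pixel, txpitch):
    """Mirror of the quoted check: single-row pixmaps always pass,
    otherwise the rounded row size must equal the texture pitch."""
    if height_px == 1:
        return False
    return aligned_row_bytes(width_px, bits_per_pixel) != txpitch
```

With a 64-byte pitch, a 4-pixel-wide 32 bpp pixmap rounds to 32 bytes and would hit the fallback, while a 16-pixel-wide one rounds to 64 and would not. Disabling the check with `#if 0` amounts to always taking the `False` branch.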
agd5f: otaylor: same test?
otaylor: agd5f: repeats-4-8.py should be a good test of this
agd5f: otaylor: results are identical
otaylor: agd5f: Cool.
otaylor: agd5f: Of course, then I have the same question about r100 :-(
agd5f: otaylor: I'll swap in the r100
agd5f: give me 2 minutes
otaylor: agd5f: Actually, why don't you wait on that for a bit
agd5f: sure
otaylor: agd5f: Let me try to get a complete set of working patches that you can test on r200, then you can go to the r100 and try them there
otaylor: agd5f: Rather than swap/swap-back
agd5f: perfect
spstarr: agd5f: did AMD release the r3xx specs for you (even though you work @ AMD :-)
agd5f: spstarr: ? I've put together all of the 3D stuff we've released lately. working on r6xx now
spstarr: agd5f: ok, so then its now a matter of cleanups for the r300?
agd5f: spstarr: yeah. cleanups, and probably some fixes now that we know better how some things work
spstarr: ok, just trying to get a figure because theres still a lot of software fallbacks :/
eboettcher: agd5f: I'm looking into http://dri.freedesktop.org/wiki/R300FragmentProgramOptimization right now. Are there any r300 specific fragment program docs I can look at?
otaylor: spstarr: I figured out yesterday that the subpixel-text "fallbacks" don't fall back to software, but to a two-pass hw path
agd5f: spstarr: yeah, lots of stuff isn't implemented yet
spstarr: agd5f: ok, fair enough
spstarr: otaylor: hmm
tstr: Hi!
agd5f: eboettcher: the r5xx 3D guide
osiris_: agd5f: what about r200 programming guide? after r600 docs?
agd5f: the theory is the same although the program loading is different
otaylor: spstarr: You can figure out the fragment programs pretty well from the r300 register docs + the r500 docs
tstr: can someone help me with a radeon X11 driver question?
agd5f: osiris_: probably. we still haven't tracked down an editable copy
eboettcher: I'm going to propose general code cleanups etc, but if I get to work on this project I will probably be working on other details like bugs, fragment program optimization, etc
spstarr: I see
eboettcher: agd5f: is there a lack of qualification tasks or anything of that sort?
agd5f: eboettcher: what do you mean? to apply for gsoc?
otaylor: tstr: "don't ask to ask, just ask" is the general rule
eboettcher: agd5f: for dri
eboettcher: several other projects have qualification tasks
eboettcher: but the due date is tomorrow so if there is anything of that sort I must get it done now
glisse: agd5f: btw do you know a good place where i can look at the gamma palette and good safe defaults, other than the radeon ddx where the code is a bit hard for me to understand
agd5f: eboettcher: just start playing with it. if you feel comfortable and think you can accomplish the task, go ahead and submit
agd5f: glisse: you mean the CLUTs?
glisse: DAC_CNTL things
eboettcher: agd5f: well right now I'm playing with it on an rs400 and there are issues :P
glisse: PALETTE_DATA
tstr: ok :) I have a R8500 card, I've installed Xorg 7.1, and the system crashes (complete ... not even pingable, no video). It seems it crashes in libglx, I've googled and it seems others have the problem too. glxinfo says: OpenGL renderer string: Mesa DRI R200 20060327 AGP 1x x86/MMX+/3DNow!+ TCL. Will the open source ati driver help me with this or am I completely on the wrong track?
agd5f: yeah. those are the CLUTs
tstr: (the system only hangs if I use something with 3d, like xlock with 3d screenblankers or sweethome3d ...)
agd5f: tstr: looks like you are using the open source driver
arekm: tstr: try latest mesa 7.0.3. it has one fix which did fix some r300 lockups for me
tstr: agd5f: it's the default that comes with Xorg 7.1, the line in xorg.conf is Section "Device", Identifier "ATI Radeon 8500", Driver "radeon", Option "DDCMode" "on", EndSection. would that be ok?
glisse: agd5f: fbcon is setting only 15 values and this looks good while my dumb 256 ramp values make things look far too bright
agd5f: tstr: yeah
tstr: Is it sufficient to just recompile mesa or do I have to update other libraries too?
agd5f: glisse: not sure. I'm not that familiar with fbcon. also if you are using dac 2, you'll need to make sure you set the dac adj values correctly
tstr: (because the mesa source tree is used to compile the X server as well ... )
glisse: agd5f: just dac1 but i don't understand the clut fbcon set
glisse: it is not like a linear gradient for all color
OipOS: tstr: safest is of course, to compile mesa, xserver and the drivers(keyboard, mouse, ati)
glisse: look like 5 first color CMYK and RGB
tstr: puh ..... ok. There's not an `alternate' driver for ati (except the commercial one) that will fix the issue? Because somewhere I read the standard libglx that comes with Xorg is responsible for those crashes and it's kind of confusing me ... :/
agd5f: tstr: you might try a newer version. xorg 7.1 is pretty old
arekm: tstr: the fix I'm talking about is http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=commitdiff;h=2407e48f2805e27e76e2e1d7083926c4077d9032;hp=5b91ee27c0f6e6379a9dc0bb41f4aef2f66b6346
agd5f: glisse: the CLUT is used to convert logical colors into physical colors. the radeon dac supports 30 and 24 bit modes
tstr: Hmmmm. Good idea ... I didn't want to start with a cutting edge release and now I'm a little afraid of a complete recompilation since it takes so long :(
tstr: Ok, thanks for the help :)
agd5f: s/dac/LUT/
agd5f: glisse: you might want to make sure you are in 24 bit mode
glisse: agd5f: you are thinking to which register ?
glisse: i am definitely displaying a 24bit buffer
agd5f: glisse: 24 bit should be a straight load.
glisse: doesn't seems to be
glisse: the fbcon gamma make my X look good
agd5f: glisse: I thought you said it was loading a depth 15 palette?
otaylor: agd5f: OK, new patch up on http://bugs.freedesktop.org/show_bug.cgi?id=15371, also published my git repo and put a reference there
glisse: agd5f: fbcon CLUT has only 15 values but this makes my 24-bit X framebuffer look correct
otaylor: agd5f: If this passes testing on R200, it should be good to test on R100 as well
glisse: also fbcon know it's using a 24bits framebuffer
agd5f: glisse: DAC_CNTL.DAC_8BIT_EN and I don't recall where the 30/24 bit stuff is off hand
agd5f: otaylor: cool
glisse: oh and i set both 24 & 30 pal to besure
otaylor: agd5f: Test is http://fishsoup.net/tmp/repeats.py http://fishsoup.net/tmp/repeats.png ... if that doesn't fit on your monitor fully, the most interesting part is the upper right corner
otaylor: Revellion: Don't apply this new patch if you are still working on building a new Xserver, it will obscure the bug that you are seeing :-)
arekm: otaylor: is r300 "desktop performance" affected by your changes?
otaylor: arekm: Yes. How much depends on what you are using for a desktop
arekm: otaylor: trying every fix/branch that people do in a hope that kde konsole scrolling won't be slow as hell ;)
otaylor: arekm: but the goal was to make exa usable with gradient-full gnome themes on my desktop
otaylor: arekm: Hmm, I don't know if that will help with that or not.
arekm: testing (x600 mobility)
otaylor: arekm: sysprof is a good starting point for figuring out what is going on in a particular situation
otaylor: (you may find the x server is spending all its time waiting for the gpu, but usually really bad performance is not bottlenecking on the gpu...)
agd5f: otaylor: looks identical
otaylor: agd5f: great. Can you screenshot it?
agd5f: yup
otaylor: agd5f: (it's sort of a complex image to detect minor differences in...)
agd5f: otaylor: http://www.botchco.com/alex/otaylor/agd-repeats.png upper right corner
arekm: otaylor: will see that too
arekm: since it's still slow + tons of debug messages: R300CheckCompositeTexture: Unsupported picture format 0x1011000
agd5f: otaylor: I can try and get the whole thing, but I need to switch the montior
arekm: R300CheckComposite: Component alpha not supported with source alpha and source value blending.
otaylor: arekm: oh ... are you using bitmap fonts on konsole?
otaylor: agd5f: Yeah, that looks perfect. Want to test on R100?
arekm: otaylor: seems so (terminus font, pcf, http://www.is-vn.bg/hamster/jimmy-en.html)
otaylor: arekm: You might actually want to turn off the fallback tracing when testing performance ... if you are getting enough debug messages that's actually going to be your bottleneck
arekm: 11964 R300CheckCompositeTexture messages huh :>
otaylor: arekm: OK, that's going to be *slow* with exa. That's what the "Unsupported picture format 0x1011000" is about
agd5f: otaylor: here's the full one: http://www.botchco.com/alex/otaylor/agd-repeats-full.png
arekm: otaylor: changing font then :)
agd5f: otaylor: I'll swap in the r100 now
otaylor: arekm: It's basically fixable for the speed problem ... all you need to do is to get Qt (or Xft, not sure if Qt/Konsole is using Xft or not) *not* to be smart and try to use a packed bitmap for fonts
otaylor: arekm: that are bitmaps
otaylor: arekm: or with more effort, exa could do expansion from 1bpp to 8bpp when uploading into video memory
arekm: otaylor: this font looked nice, that's why I used it but well, I prefer speed. Now liberation mono (feels much faster)
otaylor: (first is a 1-day project, the second more like a 1-week project)
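The second option, expanding 1bpp glyph bitmaps to 8bpp at upload time, can be sketched as follows. This is a minimal illustration, not exa code: the function name is hypothetical, and the bit order within each byte is an assumption exposed as a parameter, since the log doesn't say which order the server uses.

```python
def expand_a1_to_a8(packed_row, width, msb_first=True):
    """Expand one row of a packed 1-bit-per-pixel mask to 8 bits per pixel.

    Each set bit becomes 0xFF (fully opaque), each clear bit 0x00.
    `msb_first` selects which end of each byte holds the leftmost pixel
    (an assumption for this sketch).
    """
    out = bytearray(width)
    for x in range(width):
        byte = packed_row[x // 8]
        bit = 7 - (x % 8) if msb_first else x % 8
        out[x] = 0xFF if (byte >> bit) & 1 else 0x00
    return bytes(out)
```

The expanded row is eight times larger than the packed one, which is the trade being discussed: more video memory per glyph in exchange for a format the accelerated path accepts.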
otaylor: arekm: You are still probably paying a bit of a penalty from the fallback logging, since you will be getting the " Component alpha not supported with source alpha and source value blending" messages
otaylor: arekm: that message doesn't actually indicate a software fallback, just a hardware fallback to a two-pass method of drawing text (where the second pass ends up doing nothing for black text, but that's a different optimization...)
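The two-pass method can be sketched numerically. This assumes the usual decomposition of component-alpha "over" into an OutReverse pass followed by an Add pass; the driver code isn't shown in this log, so treat the pass names and the helper below as illustrative. Values are floats in 0..1 and the source colour is premultiplied.

```python
def component_alpha_over_two_pass(src_rgb, src_alpha, mask_rgb, dst_rgb):
    """Return the destination after pass 1 and after pass 2.

    Pass 1 (OutReverse): dst = dst * (1 - mask * src_alpha), per channel.
    Pass 2 (Add):        dst = dst + src * mask, per channel.
    """
    after_pass1 = [d * (1.0 - m * src_alpha) for d, m in zip(dst_rgb, mask_rgb)]
    after_pass2 = [d + s * m for d, s, m in zip(after_pass1, src_rgb, mask_rgb)]
    return after_pass1, after_pass2
```

For black text `src_rgb` is all zeros, so the Add pass contributes nothing and both results are equal, which is the no-op second pass mentioned above.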
edgecase: interesting, it looks like PCI bridges for parallel PCI, AGP, and PCI-e all have a RESET bit in their bridge control registers
arekm: turned off fallback debugging
edgecase: setpci doesn't seem able to affect it tho
agd5f: otaylor: looks good on r100 as well: http://www.botchco.com/alex/otaylor/agd-r100-repeats.png
otaylor: arekm: I'm not actually that satisfied with the text performance on r300 ... it's a lot slower than on i965 and I'd expect them to be about equivalent in gpu power. But I'm not sure what the bottleneck is or even really how to investigate that
otaylor: agd5f: Hmm, I think we have issues
otaylor: note the second one down on the left side of the upper right block
otaylor: and also the second and third ones down on the left side of the bottom right block
otaylor: agd5f: Looks like width 4 textures are giving us issues
agd5f: otaylor: ah yes
otaylor: agd5f: on the r100 are pixmaps going to be padded out to 32-byte widths or 64-byte widths?
agd5f: otaylor: I think it's the same for all radeons, but I'll check
otaylor: agd5f: one explanation for problems for width 4 is it's the removed texture pitch check, but if we are 64-byte padded, then that would be the same for width 4 (first column) and width 8 (second column)
agd5f: otaylor: yeah they all pad out to 64
otaylor: agd5f: but a pitch problem doesn't make sense really ... that wouldn't explain why larger sizes at the same pitch come out perfectly
otaylor: agd5f: Can you try simple-repeat.py and repeats-4-8.py ?
agd5f: yup
otaylor: agd5f: This actually looks remarkably like Revellion's problem, which makes little sense to me, since I *thought* that we are going through entirely different rendering paths here. But maybe my intuition is wrong.
agd5f: otaylor: simple-repeat looks fine
eboettcher: oh, good news for me: I printed off the r500 acceleration guide at school (somehow not getting in trouble for it)
otaylor: eboettcher: You better hope that there's not another revision :-)
agd5f: otaylor: http://www.botchco.com/alex/otaylor/agd-r100-simple-repeat.png
eboettcher: otaylor: there's plenty of computers around for me to look at the revs on them
otaylor: agd5f: You know that you can screenshot a single window with alt-printscreen?
eboettcher: having the print copy can serve as a refence for the general idea :)
agd5f: otaylor: good to know :)
otaylor: agd5f: Yeah, that looks fine, which is more correspondence with Revellion's issue, which didn't show up there, but did show up on repeats-4-8
agd5f: otaylor: on 4-8.py, it renders correctly sometimes depending on how many times I hit r
otaylor: agd5f: OK. So this is good (sort of) since it seems unexpectedly like we've reproduced the other problem :-)
agd5f: otaylor: http://www.botchco.com/alex/otaylor/ r100-*
otaylor: agd5f: So the next question is on 4-8.py, exactly how it's being rendered... you certainly don't need me to step-by-step through how to do that. I was looking at things by running my test program in a bare X server, attaching it to it with a debugger, breaking in exaComposite, hitting r and going from there, but you likely have your own methods :-)
otaylor: agd5f: if you want to pick this up at some other time, of course, just let me know.
agd5f: otaylor: sounds good. I'll take a look
eboettcher: on tuesday I can start messing with this r500
eboettcher: (I decided that after 5 years I should upgrade so I got a phenom, since I needed a PCI-E board)
Magnade: otaylor: btw i tried out your patches on a r300 and after adding a define in the source for the pixman format A macro i found it's working fine and i do notice some speed ups
otaylor: Magnade: for the PIXMAN_FORMAT_A call? what version of pixman do you have?
Magnade: otaylor: at the time i didnt have any installed -dev package wise at least
Revellion: back
otaylor: Magnade: How did it compile at all then....
otaylor: looks
Magnade: otaylor: i was mildly curious about that myself, it did give a warning
Magnade: otaylor: thats only reason i knew to go looking for it
otaylor: Oh, that's what agd5f meant about the pixman macro :-)
otaylor: Magnade: OK, so yeah, probably it won't compile unless your xserver headers pull in pixman headers. I was thinking that the driver was using pixman constant already, but it's actually using the corresponding server constants
otaylor: Magnade: They are used interchangeably in the current tree, so using the pixman macro is legitimate, to an extent, but maybe not the best choice
Magnade: otaylor: as i dont know the code base well i cant say if using the macro is a good thing or not but it could use a check in configure for it
otaylor: Magnade: I'll figure out a fix
otaylor: Magnade: What xserver version are you building against? 1.3?
Magnade: yeah log says 1.3
Magnade: ubuntu gutsy version
otaylor: Magnade: So the best fix I think is just to replace PIXMAN_FORMAT_A with PICT_FORMAT_A
otaylor: Magnade: actually, if you can test that and see if it compiles, I'd appreciate it
Magnade: otaylor: i dont see any warnings being spewed
otaylor: Magnade: Cool. Useful information :-)
agd5f: otaylor: yeah PICT_FORMAT_* should do the trick
agd5f: otaylor: found the problem
agd5f: accelerated DFS
otaylor: agd5f: ah, so for R200 it was a xorg.conf difference?
agd5f: otaylor: I haven't tested again on r200, but on r100 I get bad rendering on the 4-8 test with DFS on, and it works fine with DFS off
agd5f: haven't gone any further than that yet
otaylor: agd5f: offhand I'm not sure what DFS is
agd5f: otaylor: Download from screen
agd5f: accelerated FB reads
otaylor: agd5f: So that means that the 4,8 is going through software....
otaylor: agd5f: My theory was that something was going wrong screen=>host yesterday, but I wasn't expecting that to be triggered here
otaylor: agd5f: Do you have any idea why it's hitting fallback path?
agd5f: otaylor: on agp they are notoriously problematic. usually they work fine on PCI
otaylor: So it seems to be in particular a problem with *small* pixmaps
agd5f: yeah
otaylor: <= 1024 bytes say
otaylor: (and the smaller the more problematical)
agd5f: otaylor: there's even a note about problems with small transfers in the SFS code in radeon_exa_funcs.c
agd5f: *DFS
otaylor: agd5f: Also note that it's the top of the pixmap that's wrong.. which sounds like it is what that is talking about
agd5f: yup
otaylor: that note is talking about ... the memcopy isn't waiting for the blit properly so it starts reading immediately before the data gets there
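The kind of race being described, a CPU copy that starts before the blit has landed, can be illustrated with a host-side analogy in Python. Nothing here is driver code: the thread stands in for the GPU blit, the `Event` stands in for whatever idle or fence wait the driver would use, and all names are made up for this sketch.

```python
import threading
import time


def download_from_screen(src, wait_for_blit=True):
    """Copy `src` via a scratch buffer that a background 'blit' fills in.

    With wait_for_blit=False the reader may copy the scratch buffer
    before the blit has written it, returning stale (zeroed) data.
    """
    scratch = bytearray(len(src))
    blit_done = threading.Event()

    def blit():
        time.sleep(0.05)      # stand-in for GPU latency
        scratch[:] = src      # the blit lands in the scratch area
        blit_done.set()

    worker = threading.Thread(target=blit)
    worker.start()
    if wait_for_blit:
        blit_done.wait()      # the step the note says is not done properly
    result = bytes(scratch)   # the memcpy back to the caller
    worker.join()
    return result
```

Calling it with `wait_for_blit=False` will usually return zeros instead of the data, a loose analogue of stale contents showing up where the transfer had not finished yet.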
otaylor: So I'm not going to be able to provide useful input on AccelDFS :-) .. do you want to briefly look at why DownloadFromScreen is being called at all for 4-8.py ?
agd5f: otaylor: yeah, I'll try and see what's happening
otaylor: Revellion: OK, it sounds like we know where your issue is coming from, if not how to fix it other than turning off the AccelDFS option
arekm: btw. dfs on pci-e is like pci (fine) or like agp (problematic) ?
agd5f: arekm: should be same as pci
agd5f: but we seem to have found an issue anyway, at least for small transfers
Revellion: otaylor: which is very unlikely to be a long-term solution :)
Revellion: but it would probably remove the distortions for the moment
otaylor: agd5f: Could it actually be a memory coherency issue? The gpu thinks it's idle, but the writes aren't yet visible to the processor?
otaylor: (OK, I probably should not idly speculate about what I know nothing about)
otaylor: Revellion: Well, long term, we'd hope that the amount that you have to DownloadFromScreen is minimal :-)
Revellion: otaylor: indeed :)
Revellion: would be quite nice to see transfers going from card->CPU and back forth be less. but that's not today ^^
otaylor: Revellion: With my patches you should be getting almost none in normal desktop operation unless you are using bitmap fonts
Revellion: otaylor: hmm. i don't even think i have any bitmap fonts in my fontpath at all :|
Revellion: pretty much TrueType fonts only XD
AGP_Spec: anybody know if FastB2B+ is the same as FW+ for the AGP bus?
otaylor: Revellion: You'll get a few from FF3 because of the way it uses GTK+ widgets
Revellion: aah
Revellion: i'd say it's more of a FF3 being bad in that case :)
Revellion: than the drivers underlying it :)
otaylor: it renders them into a pixmap, then copies them back into an client-side image, then writes them to the screen again
otaylor: Revellion: Yeah, it's firefox being hacky anyways
Revellion: mhm
Revellion: i'd love for em to be more abstract in more clever ways than trying to make itself into a sophisticated drawing program drawing fake GTK UIs :P
otaylor: Revellion: Actually, you'll see a lot more with the official builds of FF3 right now, because it doesn't trust render repeats and uses a hacked internal version of cairo, but we're working on getting that fixed
Revellion: sort of like the VideoLan player is doing, having different UserInterface backends :\
Revellion: GTK and GTK
Revellion: otaylor: hmm, would it build properly with an external cairo?
Revellion: since at build i recall it being possible to enable
Revellion: but ofcourse official builds will still suffer
otaylor: Revellion: Right thing to do is to write a theme system that's independent of GTK+, and call that from a shim GTK+ theme... but that's long term
Revellion: mhm
Revellion: well patience is a virtue i suppose :D
Revellion: i'm quite pleased to see r200 with EXA and scrolling in the FF3 not being as slow as it used to be
otaylor: Revellion: Yes, you can enable system cairo explicitly. And actually the "official" builds aren't going to be "official" for FF3 .. official is distro builds, binaries on firefox.com are just snapshots
Revellion: feels on par with XAA
otaylor: Revellion: If you are noticing slow scrolling in gmail, btw, that's something else. Up to very recently ff3 was rerendering the entire page on scroll.
Revellion: if not better.. but i have no benchmark :)
Revellion: hmm
Revellion: re-rendering it on scroll would indeed cause slowdown :S
Revellion: otaylor: which build-date did that behaviour change?
Revellion: otaylor: holy damn.. you are right it's horribly slow to scroll..., and afaik i havent really gotten the latest FF3 build installed
otaylor: Revellion: https://bugzilla.mozilla.org/show_bug.cgi?id=424915. Hmm, looks like it hasn't actually landed yet
Revellion: nasty bug
otaylor: (also https://bugzilla.mozilla.org/show_bug.cgi?id=382392, but the gmail slowness is just silly. They were rerendering the entire page because there were fixed position *hidden* iframes)
Revellion: otaylor: hmm temp workaround seems to be...
Revellion: use old gmail UI
Revellion: hidden iframes..
Revellion: smells a hack
otaylor: Revellion: a pretty typical sort of hack for advanced ajax web pages
Revellion: yeah
Revellion: hmm ill use ui=1 for now
Revellion: till the newer "fixed" builds are out
Revellion: never noticed that slowness before though. but then i rarely keep my inbox too cluttered
Revellion: most are in archive and labeled
Magnade: sounds like ill stay away from firefox3 for a while longer then :P
eboettcher: woah.. I start evince and the secondary display is turned off..
otaylor: Hmm, so my i965 is roughly 17x faster at drawing 10 character strings than my R300. Something is fishy there.
Magnade: otaylor: ouch
Magnade: would that in part explain why in gtkperf textbox item is so painful?
otaylor: Magnade: could be. I've never trusted gtkperf a whole lot
Magnade: i965 is shared memory also, isn't it, rather than dedicated memory like an r300 probably is
eboettcher: Magnade: the mobile r300s can use shared memory
mcgreg: hey guys how do you do those tests?
otaylor: Magnade: That could be the root cause of the problem, but it's not inherent, since I'm just drawing the same glyph over and over again millions of times
otaylor: Magnade: so we could definitely download it to video memory and leave it there
otaylor: mcgreg: I write little bits of pycairo code :-)
Magnade: the key i was getting at is the shared memory should be slower
Magnade: so that 17x figure could be considered even larger
otaylor: Magnade: Considering the generation difference, I think the cards are roughly equivalent in raw horsepower - in glxgears the r300 does 1900fps, the i965 1700
otaylor: Magnade: But yeah, I certainly wouldn't expect the r300 to be the less powerful of the two offhand
otaylor: (I guess "chips" would be more accurate, since neither of these is a discrete card...)
Magnade: right
arekm: wonders what's the horsepower of intel X3100 in comparison to r300
otaylor: arekm: That's the comparison I'm making
otaylor: arekm: I think.. Not that up on all the numbers
arekm: ah, so x3100 == i965? I was thinking that's something newer 8)
Magnade: looking at wikipedia it looks like thats the case arekm
otaylor: arekm: well, the 965 is a family. This is a "GMA 3100" to be specific. Not sure if that's the same as a X3100, or that's a newer family
otaylor: arekm: But afaik, the 965 is the last released family from intel
Magnade: the wikipedia item did go to the gma 3100 btw
Magnade: otaylor: ping me if you have something you want tested on r300 optimization wise btw :)
Magnade: bbl
OipOS: Oh btw, I lost my hardware acceleration(r300) when I upgraded mesa/xserver/ati to git today...
OipOS: Here's the output of glxinfo: http://pastebin.com/d271c7bbd
agd5f: otaylor: I don't seem to be hitting exaComposite() at all with repeats-4-8.py
otaylor: agd5f: Oh, cairo version?
agd5f: otaylor: whatever shipped with ubuntu feisty
otaylor: agd5f: pkg-config --modversion cairo will tell you. But hmm, this makes the test results a bit uncertain. May not actually have been testing the new code at all
agd5f: yeah
otaylor: agd5f: There's a hack in cairo to say "for old X servers, don't use repeats", but up until 1.5.
otaylor: agd5f: Hmm. trying to think of the easiest way to get a good test. I don't want to rewrite in raw XRender.
agd5f: otaylor: looks like 1.4.2-0ubuntu1.3
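A minimal Python sketch of the version check implied above: it parses a packaged version string such as the one agd5f reports and compares it with the 1.5 cutoff otaylor mentions. The function names and the exact cutoff tuple are assumptions for illustration; the log does not say which release the hack ends in.

```python
import re

# Assumed cutoff, taken from the "up until 1.5" remark; the exact release is not stated.
REPEAT_CUTOFF = (1, 5)

def parse_version(text):
    """Return the leading dotted-number part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))

def may_skip_repeats(version_text):
    """True if this cairo version is older than the assumed cutoff."""
    return parse_version(version_text) < REPEAT_CUTOFF
```

With the distro string from the log, `may_skip_repeats("1.4.2-0ubuntu1.3")` is true, which would be consistent with exaComposite() never being reached by the repeat tests.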
otaylor: agd5f: It's pretty easy to use jhbuild to build an isolated pixman and cairo, let me see if I can give you a couple line recipe
agd5f: otaylor: I don't really follow cairo too closely. should I rebuild cairo?
agd5f: I'm already using the latest pixman or there abouts
otaylor: agd5f: OK, if you have a system to build pixman, just replicate that for cairo :-)
agd5f: otaylor: will do
otaylor: agd5f: pixman is the only dependency other than X libraries and libpng, for which anything will do
agd5f: sounds good
agd5f: otaylor: that fixed it
agd5f: well, it hits exaComposite now anyway
agd5f: time to retest
otaylor: agd5f: Yeah, now for the $10^6 question
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-repeats.py.png
agd5f: redraw seems to have the same effect as before on the 4 ones
otaylor: agd5f: Are you sure that that's the right image?
agd5f: yeah
otaylor: agd5f: That really looks like the DSF path for 4x8
otaylor: And it's linking against the new cairo?
otaylor: (check /proc/
agd5f: sorry, wrong cairo, new image coming
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-repeats.py.png
otaylor: OK, I'm not surprised by breakage, I just didn't expect the exact same breakage :-)
otaylor: agd5f: So, it looks like a) removing the pitch tests was wrong b) something is going badly wrong in the tiling case
otaylor: (the solid colored squares are the ones where we are drawing in tiles)
otaylor: agd5f: Can you hit '1' (turning off repeats) and screenshot that?
agd5f: yes
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-1-repeats.py.png
zero7d: hi all
otaylor: Oh, I know what's wrong with the tiling
eboettcher: O.O; what is the point of US_OMOD_U1?
zero7d: klayway 10.5.1 deriver support
otaylor: agd5f: I'm not turning on RADEON_TXFORMAT_NON_POWER2
zero7d: anyone here
agd5f: otaylor: I'll change that
otaylor: if (pPict->repeat && !(need_src_tile_x || need_src_tile_y)) I think
zero7d: can anyone help me
zero7d: install my klayway ati driver in my pc
otaylor: zero7d: You are in the wrong channel ... this channel is about Linux
zero7d: ok
otaylor: (to first approximation, apologies to any BSD folks)
zero7d: can you tell me the channel
zero7d: that ivse to be in
zero7d: have
otaylor: zero7d: I don't know. There may not be one on this server
zero7d: ok thx
eboettcher: I would have tried sending him to ##
eboettcher: I see, US_OMOD_U1 can be used when one wants the output to be clamped as this is not done with US_OMOD_DISABLED
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-1-repeats.py.png and http://www.botchco.com/alex/otaylor/r100-repeats.py.png
otaylor: Hmm, that would imply that the NON_POWER2 flag wasn't it
otaylor: agd5f: Do you see any other place where we are checking pPict->repeat and not need_src_tile_[x,y]?
agd5f: unless I patched it wrong
otaylor: agd5f: I'm working on a revision that is the way I think it should look
otaylor: agd5f: OK, new set of fixes pushed to my git repo, will upload it to the bug report in a second
agd5f: otaylor: cool
otaylor: (I'll merge all of these together if we actually get it working...)
agd5f: pulling now
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-repeats.py.png
otaylor: agd5f: That looks good
otaylor: agd5f: Hopefully it's not looking good because some stuff is getting fallbacks....
agd5f: some stuff is
agd5f: apparently
agd5f: hitting r breaks some of the 4 ones some of the time
otaylor: agd5f: If you pulled my git tree, you should have fallback logging on, can you see if anything gets logged when you hit r?
agd5f: nothing seems to be logged
otaylor: agd5f: right cairo?
agd5f: I think so, lemme check
agd5f: otaylor: yup. it's using the right cairo
otaylor: agd5f: hmm. something must have gone wrong with my push or your pull, since embarrassingly, my current version doesn't compile...
agd5f: are you on a branch?
agd5f: I just used master
otaylor: agd5f: Yeah, this is the npot-repeat branch, sorry
agd5f: whoops
otaylor: agd5f: master is just master
agd5f: heh
otaylor: Let me fix this before you repull :-)
agd5f: yup
otaylor: OK, compilation fix pushed
agd5f: compiling
agd5f: otaylor: similar to previously: http://www.botchco.com/alex/otaylor/r100-repeats.py.png
otaylor: agd5f: Well, we've taken down one problem
otaylor: agd5f: It looks like the texture coordinates are entirely wrong when repeating or something like that, since note that we're getting the upper left hand color as a constant color
otaylor: when tiling, that is
agd5f: yeah
agd5f: otaylor: should we be using pPix->drawable.width and height?
otaylor: agd5f: are they ever going to be different?
agd5f: I don't think so, but I'm not really an expert
otaylor: My assumption was that the pitch might be different (from the screen download) but the size would always be the same
agd5f: sounds reasonable
otaylor: agd5f: OK, I see one bug, but it should only affect repeating masks, and it's the same on r300
otaylor: (going to have to add repeating masks to my test case)
otaylor: agd5f: One quick experiment, can you try:
otaylor: @@ -379,9 +379,13 @@ static Bool FUNC_NAME(R100TextureSetup)(PicturePtr pPict, PixmapPtr pPix,
otaylor:
otaylor: if (pPict->repeat && !need_src_tile_x)
otaylor: txfilter |= RADEON_CLAMP_S_WRAP;
otaylor: + else
otaylor: + txfilter |= RADEON_CLAMP_S_CLAMP_BORDER;
otaylor:
otaylor: if (pPict->repeat && !need_src_tile_y)
otaylor: txfilter |= RADEON_CLAMP_T_WRAP;
otaylor: + else
otaylor: + txfilter |= RADEON_CLAMP_T_CLAMP_BORDER;
otaylor: It's not going to fix it .. but it might make clear what is going on... whether we are getting texture coords wrong or texture size wrong
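The logic of the experimental patch above can be written out as a small truth-table sketch in Python. The strings stand in for the RADEON_CLAMP_S_* and RADEON_CLAMP_T_* flags; they are placeholders rather than register values, and the function name is hypothetical.

```python
def wrap_modes(repeat, need_src_tile_x, need_src_tile_y):
    """Return the (S, T) wrap mode names the experimental patch would select.

    Hardware wrapping is used on an axis only when the picture repeats and
    that axis does not need software tiling; otherwise the patch falls back
    to CLAMP_BORDER so that sampling off the edge becomes visible.
    """
    s_mode = "WRAP" if repeat and not need_src_tile_x else "CLAMP_BORDER"
    t_mode = "WRAP" if repeat and not need_src_tile_y else "CLAMP_BORDER"
    return s_mode, t_mode
```

For example, a repeating picture that needs tiling only in x gives `("CLAMP_BORDER", "WRAP")`, so any stray sampling past the left or right edge would show the border color instead of wrapped texels.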
agd5f: otaylor: yep
agd5f: otaylor: http://www.botchco.com/alex/otaylor/r100-test-repeats.py.png
agd5f: looks similar to the previous
otaylor: agd5f: That probably means it's not texture coordinates, but how we are setting up the textures, since if we were sampling off the edge, I'd expect CLAMP_BORDER to give whatever random color happened to be in the border color register
otaylor: agd5f: Wait, no CLAMP_BORDER isn't the right value for that... I think I meant CLAMP_GL
otaylor: agd5f: But it's low value information at best ... I think I'm going to declare myself stumped for the evening
otaylor: agd5f: maybe sitting on it will yield inspiration, if not I can probably borrow a board from ajax/krh
agd5f: otaylor: no worries. thanks for looking at it
otaylor: agd5f: I'm pretty positive it has to be something really simple, but not obvious what
otaylor: agd5f: Thanks for your help .. I feel that I'm only a few characters from having a working patch on all chipsets now
agd5f: those are always the hardest to track down :)
airlied: otaylor: the r300 should kick the 965's ass mostly..
airlied: otaylor: I'm not sure yet why we don't :)
agd5f: otaylor: looks pretty much the same with CLAMP_GL: http://www.botchco.com/alex/otaylor/r100-test-repeats.py.png
otaylor: airlied: well the 965's exa text rendering was dead-slow too until cworth tackled it, so maybe there is something systematic in exa that requires driver-by-driver attention
damentz: hey agd5f, i'm using this tool called grandr to clone a monitor on my laptop
otaylor: airlied: The first indication is that it looks bandwidth bound - I get the same number of 10x10 characters/sec if I draw them 1 at a time, or 50 at a time
damentz: but i can't extend it, it complains about setting the screensize larger than max screen size (ya redundant)
damentz: do i have to specifically add xorg.conf options or something?
agd5f: damentz: you need to set a larger virtual size in your config
damentz: ahh
damentz: ok
otaylor: Which is very different from the 965 where that goes from 80,000/sec to 600,000/sec depending on the chunking
damentz: this would be great when i'm connected to a projector
damentz: do work on my laptop screen and present on a projector
otaylor: (it's a constant 15,000/sec or so on my r300)
spstarr: waits for the patch factory to spit some new patches out
airlied: otaylor: yeah mine goes at 17000 and aa10text vs aa24text isn't majorly different
airlied: we appear to be waiting for the GPU a lot.
otaylor: airlied: It looked to me we were mostly waiting for the gpu in terms of the output ring getting full and not to download for screen or anything
edgecase: airlied, looking at chipset docs, and experimenting with setpci, it seems PCI and AGP bus resets aren't implemented ;<
edgecase: i suppose each GPU could implement a reset of large parts of itself, say with a PCI config space register
edgecase: still waiting to see about pci-e, i'm still hopeful it can be reset with a big hammer
airlied: otaylor: yes, I've no idea why its so bad, need to try decreasing the state emission to see if that helps.
airlied: otaylor: but it may be a wait or flush in the wrong place.
otaylor: airlied: Oh wait, we are emitting the state per glyph. Yuck!
airlied: I noticed the DRM does a 3D flush on indirect buffer submission that may not be necessary. but removing it didn't help
airlied: otaylor: yes, we have no state caching et..
airlied: yet
airlied: and composite doesn't lend itself to batching.
otaylor: airlied: might help a lot to overload ->Glyphs and Prepare only once for the whole glyph string
otaylor: airlied: batching all the glyphs in a single protocol request should be easy
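A toy Python model of the batching idea above: it only counts how often state setup runs when a glyph string is drawn one Composite at a time versus with a single Prepare for the whole string. `CountingDriver`, `draw_per_glyph` and `draw_batched` are hypothetical names; this is not the EXA API, just a sketch of the call pattern.

```python
class CountingDriver:
    """Stand-in for a driver that records how often each hook is called."""

    def __init__(self):
        self.prepares = 0    # expensive: state/shader setup
        self.composites = 0  # cheap: one quad per glyph

    def prepare(self):
        self.prepares += 1

    def composite(self, glyph):
        self.composites += 1


def draw_per_glyph(driver, glyphs):
    """Current behaviour as described in the log: state is emitted for every glyph."""
    for glyph in glyphs:
        driver.prepare()
        driver.composite(glyph)


def draw_batched(driver, glyphs):
    """Proposed behaviour: prepare once, then composite the whole string."""
    if not glyphs:
        return
    driver.prepare()
    for glyph in glyphs:
        driver.composite(glyph)
```

For a 10-character string the per-glyph path calls `prepare` ten times and the batched path once, while both issue ten composites. Whether that saving shows up as throughput depends on how costly the state emission really is on the hardware.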
airlied: otaylor: I think cworth was meant to look at the for RH..
airlied: otaylor: but not sure if he ever actually did anything.
airlied: otaylor: it might be worth asking him..
ajax: airlied: he looked at it for 965 but found that it wasn't a super-huge win there, iirc.
airlied: polycomposite it was called I believe.
otaylor: ajax: state submission might be smaller. We are setting up a shader for every Composite call right now
airlied: otaylor: at least on radeon state setup isn't as bad as on intel 965.
otaylor: ajax: I wouldn't even be surprised if we caused a pipeline stall
ajax: airlied: yeah, i still think proper ->Glyphs wrapping in exa would be a win
otaylor: airlied: I don't really have the background to say what "bad" is, but R300PrepareComposite is not a short function
airlied: otaylor: oh I know we could decrease it, at least for strings of operations where the operation doesn't change
agd5f: we should reserve some shader space and keep some common programs loaded then we can access them by offset
airlied: agd5f: on the r500 true.. r300 not so much.
agd5f: yeah. right
airlied: I do wonder though if the upload overhead for a small glyph is too high
airlied: like for UTS'ing them
otaylor: airlied: I think you could still squeeze in 10 or so of the shader programs on r300
otaylor: airlied: I think they are just pixmaps that should get migrated in
otaylor: (not very sure about that though)
ajax: DMA setup for UTS might be prohibitive compared to just memcpying them in.
ajax: assuming we do DMA. i think we do though?
airlied: the other thing is that glyphs probably need to go into a big caching pixmap at some point.
airlied: because eventually we'll be wasting RAM..
ajax: eh. 4k is 32x32x32bpp.
ajax: that's not a big glyph.
airlied: ajax: lots of them are 8bpp though..
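The arithmetic behind the 4k figure, as a short Python check. The helper name is made up for illustration, and it ignores any pitch or alignment padding a real pixmap would have.

```python
def glyph_bytes(width, height, bits_per_pixel):
    """Unpadded storage for one glyph image, in bytes."""
    return width * height * bits_per_pixel // 8

argb_glyph = glyph_bytes(32, 32, 32)   # 32x32 at 32bpp
alpha_glyph = glyph_bytes(32, 32, 8)   # the same glyph as an 8bpp mask
```

A 32x32 glyph at 32bpp comes to 4096 bytes, matching ajax's 4k, while the same glyph at 8bpp needs 1024 bytes.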
ajax: airlied: you know people who don't use subpixel aa? i'm shocked.
airlied: and its more the wasted space though between aligned glyphs, and the overhead of having so many small objects
airlied: ajax: I still run xemacs but I'm sick like that.
airlied: ajax: (I don't run aa though at all :)
ajax: i'm more sold on having fewer pixmaps as an end in itself than on the vram savings ;)
agd5f: ajax: speaking of which, where's shatter? :)
ajax: it morphed into about five different versions over time
otaylor: airlied: well if aa text is slow with exa, then non-aa text... hmm, is probably faster because the pixmap gets migrated out at the start and it's all software
ajax: i desperately need to merge that and push it somewhere, i know
otaylor: Updates to shader code outside the currently active program are safe, and do not stall the pipeline. If you intend to
otaylor: overwrite the active shader, however, the pixel shader pipe must be flushed so that pixels running the old shader get
otaylor: out before the update. Register writes to US_CODE_ADDR, US_CODE_RANGE, US_CODE_OFFSET, and/or
otaylor: US_PIXSIZE should flush the pixel shader pipe.
otaylor: I wonder how much performance damage we are causing on R300 by writing to those registers for every glyph.
otaylor: "flush the pixel shader pipeline" on the other hand might not be much of a performance penalty, it presumably doesn't imply flushing the whole graphics pipeline
edgecase: airlied, intel ICH7 pci-e 1x ports, support Secondary Bus Reset (SBR) — R/W. This bit triggers a hot reset on the PCI Express port
airlied: edgecase: yeah I'd expect pcie to get it right more often.
edgecase: yeah. i was pondering over the weekend, the whole ISA-VGA decode mess, I think PCI-SIG (ie intel) dropped the ball, they weren't planning for legacy removal at the point where they introduced PCI bus
edgecase: the VGA_EN bits were to support ISA vga cards on ISA secondary bus,
edgecase: but they stopped short of defining an alternate PCI-configurable VGA register window in the PCI spec.
airlied: edgecase: the market at the time didn't allow dropping in..
airlied: edgecase: too many legacy apps and OSes and hardware
edgecase: so the market made PCI cards, that had ISA io port ranges hardwired, forever sealing our doom
edgecase: sure, the solution is, let the BIOS program the PCI gfx cards VGA range, to 0x3bc etc
edgecase: but instead there's a special bit VGA_EN, to hardwire
edgecase: i guess hindsight is 20/20
edgecase: it's sad they didn't try to correct it with pci-e
airlied: edgecase: what about two BIOSes? how would they cooperate?
edgecase: well a normal PCI rom, the bios configures the BARs, feeds the bus/dev/fn to the ROM and runs it,
edgecase: the rom bangs the ports of its own device, exits, BIOS moves to next slot
edgecase: so if the VGA mem/ioport range was just a regular set of PCI BARs, there would be nothing special about graphics cards
edgecase: if the bios happened to set *one* of them to io/mem 0x3bc/0xa0000, that's up to the bios's PCI setup code
edgecase: the rom gets run with its bus/dev/fn, minding its own device and io/mem ranges set in the BARs
airlied: edgecase: its not really much different than the VGA_EN bit though is it..
edgecase: instead, we have this "magic" PCI programming interface 0/3, with *implied* BARs for io/mem, that aren't programmable. programmable BARs are the cornerstone of PCI
edgecase: well the devil is in the details, it's a huge difference
airlied: edgecase: but the BARs are well defined..
airlied: edgecase: just not explicitly in every device
edgecase: except for VGA cards, they're grandfathered to ISA legacy way
edgecase: there is no BAR for vga range
airlied: edgecase: the ranges are well defined
edgecase: it's always the ISA range, hardwired
airlied: edgecase: it would be pointless being any other range
edgecase: being defined isn't important, being *programmable* is
airlied: as the whole point is ISA compat.
edgecase: well i386 has this MMU that can map things,
airlied: at a virtual level.. not a physical one, at least mine doesn't.
edgecase: but really not necessary, with programmable BARs, DOS can bang the ports of *one* VGA adapter all day long
edgecase: *and* having programmable BARs, lets us *AVOID* atrocities like RAC
edgecase: and VGA arbiter
airlied: edgecase: and where have you got the address space in 16-bit for this?
edgecase: same place every other PCI IO BAR shares...
edgecase: seems to be enough with SCSI adapters and all kinds of other stuff installed
airlied: edgecase: they are IO BARs, what about MMIO
edgecase: same deal
airlied: two GPUs would really leave no space in 16-bit.
airlied: SCSI used very small MMIO ranges.
edgecase: MMIO is in 4G space
airlied: edgecase: not in 16-bit
airlied: which the BIOS is written in
edgecase: again, 1 card only, gets mapped to 0xa0000
edgecase: the others, are not special, they follow same rules as other adapters
airlied: edgecase: it doesn't matter, to have the BIOS POST the card it needs to be mapped somewhere
airlied: so it needs to be in 16-bit land.
edgecase: that's normal, *every* PCI adapter does that, every time system boots,
airlied: edgecase: every other adapter uses quite small ranges though from what I can see.
edgecase: well when i boot my system, and run lspci under linux, i discover the bios has mapped all kinds of things in 4G space
airlied: edgecase: that is different than having 16-bit code running on them
edgecase: sure, i'm talking about VGA legacy ranges, which are 32 registers range
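For reference, a small Python sketch of the fixed legacy VGA windows this argument is about. The ranges used are the ones a PCI bridge forwards when its VGA Enable bit is set (I/O 0x3B0-0x3BB and 0x3C0-0x3DF, memory 0xA0000-0xBFFFF); the helper names are hypothetical.

```python
# Fixed legacy VGA windows; none of these is described by a programmable BAR.
LEGACY_VGA_IO = [(0x3B0, 0x3BB), (0x3C0, 0x3DF)]
LEGACY_VGA_MEM = (0xA0000, 0xBFFFF)

def is_legacy_vga_io(port):
    """True if an I/O port falls inside one of the legacy VGA windows."""
    return any(lo <= port <= hi for lo, hi in LEGACY_VGA_IO)

def is_legacy_vga_mem(addr):
    """True if a physical address falls inside the legacy VGA memory window."""
    lo, hi = LEGACY_VGA_MEM
    return lo <= addr <= hi

# Size of the main register window, 0x3C0 through 0x3DF inclusive.
main_window_ports = LEGACY_VGA_IO[1][1] - LEGACY_VGA_IO[1][0] + 1
```

The main window works out to 32 ports, which lines up with the "32 registers" figure, and the memory window covers 128 KiB starting at 0xA0000.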
edgecase: to reiterate, i'm merely saying, PCI spec could have been designed to allow DOS to bang one VGA adapter, in legacy location
airlied: edgecase: it was.. VGA_EN.
edgecase: while letting the others act like regular PCI devices, and not requiring RAC/VGA arbiter in case of multiple cards
airlied: it can..
airlied: edgecase: you turn off the VGA decode bits on the other adapters
edgecase: VGA_EN is specifically what requires us to use RAC
edgecase: you can't
edgecase: you can only disable the card entirely with IO_EN and MEM_EN
airlied: edgecase: you can, we are going to have cards remove themselves from arbitration
airlied: edgecase: all AMD cards have bits to stop decoding the VGA ranges.
edgecase: mark the legacy ranges as unused...
airlied: the hardware can be told don't worry about legacy, and then can be removed from RAC/arbitration
edgecase: sure, i look forward to it,
edgecase: i'm just saying, PCI spec *could* have made this all possible, 20 years ago, with one subtle change
airlied: edgecase: I don't think you've thought it through properly.. but really it doesn't matter.
edgecase: and would have still enabled things like early printk on vga console
edgecase: yes, can't change history
edgecase: what we're doing breaks DOS compat, which is ok for us, but can't be pushed into mass market
edgecase: pci-e gets us a few get out of jail free cards,
ajax: what are you on.
edgecase: my idea solves it for 2 PCI VGA cards in same bus also
edgecase: on a rant?
ajax: every card since like 1999 has had a bit somewhere in its MMIO space to turn off VGA decode.
edgecase: pci-e is point to point, so you can disable vga decode at each bridge
ajax: this isn't some new pcie thing
airlied: edgecase: you can do it on PCI as well.
edgecase: i find cards lacking the jumper at boot up
airlied: edgecase: you are seriously confused.
ajax: and? just because it's not present as a jumper doesn't mean you can't do it from software.
edgecase: without the VGA disable jumper, both cards would decode legacy range at bootup, no chance to program an MMIO register
airlied: edgecase: the BIOS deals with it
airlied: edgecase: and turns one off.
airlied: edgecase: lots of ppl have been running multiple PCI GPUs for years
edgecase: turns it off, but it's still useable dual-head, if you POST it yourself i guess
edgecase: but that requires RAC
airlied: edgecase: only for the POST.
airlied: edgecase: post-POST you can put one card into non-VGA mode
edgecase: yeah
edgecase: my rant is based on, "why did they perpetuate these hacks to begin with?"
airlied: or POST it without the BIOS.
edgecase: not, can it be done
airlied: airlied: we can POST radeons without the BIOS mostly.
airlied: oops talking to myself.
ajax: edgecase: because backward compatibility is the only market force that actually matters
ajax: also, this is far from being the worst bc hack in pci
edgecase: oh?
ajax: check out vga palette snooping sometime. then cut yourself.
edgecase: well nobody uses it right?
airlied: edgecase: anymore..
edgecase: ack
airlied: edgecase: it was used quite a lot at the time.
edgecase: and now all this stuff just wastes die space
airlied: edgecase: I doubt it wastes very much..
airlied: I doubt they ever touch it from chip to chip.
edgecase: i suppose.
edgecase: i've probably wasted more collective energy just by this rant
edgecase: apologies for that ;<
airlied: considering the relative size of a 3D core and a PCI core..
edgecase: ok tks for the chat, i learned quite a bit. sometimes my curiosity demands things.
edgecase: i wouldn't be surprised if a few BIOSes got the dual PCI VGA thing wrong, and crashed or something
edgecase: i wish the backwards compat solution would have enabled early kernel msgs, alan cox is quoted "over my dead body" regarding removing VGA text console
edgecase: just collateral thought waves of mine, no need to comment
edgecase: so this is why some platforms have fbdevs, they don't *do* VGA legacy decode?
airlied: edgecase: yes nobody except x86
airlied: and maybe ia64
edgecase: offtopic, but is EFI firmware getting popular at all?
edgecase: i ask because it can pass framebuffer info to the bootloader/OS
airlied: edgecase: Apples.
edgecase: yeah
airlied: we have efifb already
edgecase: oh yeah? where at?
edgecase: for Xorg?
edgecase: or linux kernel
airlied: kernel.
edgecase: sweet
edgecase: time to go wake up all of linuxbios
airlied: hmm state emission optimising didn't seem to do much for r300 :(
airlied: we really must be doing something pathological with the hw.