Although the pipeline is already quite stable and was used to run NetBeans while it was being worked on, there are still some areas where additional work would at least not hurt ;)

1. XRender always interpolates image edges with transparent pixels when a scaling transformation is used:

[Image: border of a scaled image]

However, Java2D requires sharp edges when scaling images, so there is a correctness problem here.
All the (minor) visual differences I saw with Nimbus are caused by this problem.

Cairo offers a solution for non-sheared images on its image surfaces (software only), but not on XRender surfaces.
This looks like a shortcoming of XRender; support has been added recently but is not accelerated by most drivers.
Hopefully demand by cairo will improve the situation.

2. The composition operation of transformed blits covers too large an area:

For now I simply take the transformation set on SunGraphics2D, invert it and apply it to the source image.

However, this transformation has its origin at 0,0, and to get useful results my composition rectangle also has to start at 0,0. Now if a 10x10 image is blitted with a rotation transformation applied at 300/300, the rectangle of the composition operation is about 0,0 - 310,310, with everything not belonging to the image clipped away.

I don't know if XOrg or the drivers are able to optimize this, but I am not happy with the current solution, especially the shape-clip when shear transformations are used.
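To put a number on the overhead, here is a minimal sketch in plain Java2D geometry (not the pipeline's actual code). The class `BlitBounds` and its methods are hypothetical names made up for this illustration, and the origin-anchored rectangle is simply my reading of the description above.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, only for illustration; not part of the XRender pipeline.
final class BlitBounds {

    // Device-space bounding box of a w x h image drawn under the given transform.
    static Rectangle2D tightBounds(AffineTransform at, int w, int h) {
        Rectangle2D src = new Rectangle2D.Double(0, 0, w, h);
        return at.createTransformedShape(src).getBounds2D();
    }

    // Rectangle that starts at 0,0 and reaches the far corner of the tight
    // bounds. Assumption: the image ends up at positive coordinates.
    static Rectangle2D originAnchoredBounds(Rectangle2D tight) {
        return new Rectangle2D.Double(0, 0, tight.getMaxX(), tight.getMaxY());
    }

    static double area(Rectangle2D r) {
        return r.getWidth() * r.getHeight();
    }

    public static void main(String[] args) {
        // 10x10 image, rotated by 45 degrees around the point 300/300.
        AffineTransform at = AffineTransform.getTranslateInstance(300, 300);
        at.rotate(Math.toRadians(45));

        Rectangle2D tight = tightBounds(at, 10, 10);
        Rectangle2D wide = originAnchoredBounds(tight);
        System.out.println("tight bounds:    " + tight);
        System.out.println("origin-anchored: " + wide);
        System.out.println("area ratio:      " + area(wide) / area(tight));
    }
}
```

For this example the tight bounds are roughly 14x14 pixels, while the origin-anchored rectangle is roughly 307x314, so almost all of the composited area is clipped away again.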

3. Text-Rendering is not accelerated for TexturePaint if an AlphaComposite with extra alpha is used.
This could be accelerated if the Texture/source-image is opaque, and with some overhead it's also possible to accelerate it for ARGB.

- The RGB-case could be accelerated by setting the "alpha-map" PictureAttribute to the 1x1 alphaMask which holds EA: "The alpha channel of alpha-map is used in place of any alpha channel contained within the drawable for all rendering operations."

- For the ARGB-case we could blit the source-texture into a tile with the 1x1 alphaMask and use the result as source for the text-composition.

4. For now lines are always rendered into a mask using an X11-GC and later that mask is used for composition.

The advantage of this approach is that no validation has to be done on the "real" GC, and furthermore EXA-based drivers don't accelerate diagonal lines - so we guide them to not migrate back and forth, which is extremely expensive if running without GEM or TTM.
However, it also imposes quite high overhead for line-drawing (although in real-world applications probably not as high as suggested by J2DBench), and it would be worth tuning line-drawing.

Currently, if the mask is too small to cover an operation completely, tiling is used. For lines, only the bounding-box is used for coverage analysis, which often leads to unnecessarily large composited areas if the line is larger than one tile.

Another possible optimization would be to limit the tile-size of the line-tile to 32x32, like it's done for MaskFill.
For diagonal lines this would save quite a lot of fillrate, at the expense of higher setup/protocol costs.
(Blitting e.g. 8 32x32 tiles is probably cheaper than one 256x256 mask).
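The fillrate argument can be checked with a small sketch that counts which 32x32 tiles a line actually touches, compared with all tiles inside its bounding-box. The class `LineTiles` is a hypothetical name, and the code assumes a 1-pixel wide, non-antialiased line walked with a textbook Bresenham loop, which is not necessarily pixel-identical to what the X server rasterizes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, only for illustration; not part of the XRender pipeline.
final class LineTiles {

    // Number of distinct tile x tile cells touched by a Bresenham line.
    static int tilesTouched(int x0, int y0, int x1, int y1, int tile) {
        Set<Long> tiles = new HashSet<>();
        int dx = Math.abs(x1 - x0), dy = Math.abs(y1 - y0);
        int sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
        int err = dx - dy;
        int x = x0, y = y0;
        while (true) {
            long tx = Math.floorDiv(x, tile);
            long ty = Math.floorDiv(y, tile);
            tiles.add((tx << 32) | (ty & 0xffffffffL)); // pack tile coords into one key
            if (x == x1 && y == y1) {
                break;
            }
            int e2 = 2 * err;
            if (e2 > -dy) { err -= dy; x += sx; }
            if (e2 < dx)  { err += dx; y += sy; }
        }
        return tiles.size();
    }

    // Number of tiles covered by the line's bounding-box.
    static int tilesInBoundingBox(int x0, int y0, int x1, int y1, int tile) {
        int cols = Math.floorDiv(Math.max(x0, x1), tile) - Math.floorDiv(Math.min(x0, x1), tile) + 1;
        int rows = Math.floorDiv(Math.max(y0, y1), tile) - Math.floorDiv(Math.min(y0, y1), tile) + 1;
        return cols * rows;
    }

    public static void main(String[] args) {
        int touched = tilesTouched(0, 0, 255, 255, 32);
        int bbox = tilesInBoundingBox(0, 0, 255, 255, 32);
        System.out.println("touched tiles: " + touched + ", bounding-box tiles: " + bbox);
        System.out.println("composited pixels: " + touched * 32 * 32 + " vs " + bbox * 32 * 32);
    }
}
```

For a diagonal from 0,0 to 255,255 this counts 8 touched tiles against 64 tiles in the bounding-box, i.e. 8192 composited pixels instead of 65536.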

5. Improve Text Rendering and Glyph Caching:

Rendering Text currently draws all Glyphs submitted in one go.
This complicates GlyphCaching quite a bit and also stresses the X-Server's GlyphCaching if a lot of different Glyphs are used in a very long GlyphVector.
There is currently also an ugly hack: Java2D's GlyphInfo structure expects to hold a pointer to a linked-list element of a specific type, but that field is currently misused.

Maybe limiting the number of glyphs to 512 or 1024 could help simplify the code here, and for 1024 glyphs in one go the setup-costs of the hardware-engine should be negligible.
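The batching idea boils down to something like the following sketch. `GlyphBatches` and `split` are hypothetical names, and representing a glyph run as a plain `int[]` of glyph IDs is an assumption made for the example; the real pipeline works on Java2D's native glyph structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, only for illustration; not part of the XRender pipeline.
final class GlyphBatches {

    // Splits a glyph run into consecutive batches of at most maxPerBatch glyphs.
    static List<int[]> split(int[] glyphIds, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<int[]> batches = new ArrayList<>();
        for (int off = 0; off < glyphIds.length; off += maxPerBatch) {
            int end = Math.min(glyphIds.length, off + maxPerBatch);
            batches.add(Arrays.copyOfRange(glyphIds, off, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        int[] run = new int[2500]; // stand-in for a very long GlyphVector
        for (int[] batch : split(run, 1024)) {
            // Each batch would be cached and rendered on its own.
            System.out.println("batch of " + batch.length + " glyphs");
        }
    }
}
```

With an upper bound per batch, the glyph cache only ever has to keep one batch worth of glyphs alive at a time.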

6. Support for XOR-Composition:

Although XRender supports XOR rendering mode, it's currently not accelerated by the XRender pipeline.
Shouldn't be too hard to support it ... although nobody uses it except some of my legacy applets ;)

7. Further correctness checks and testing:

The pipeline was developed with the goal of being usable for typical applications.
Many complex applications were tried, and when a bug was found it was fixed.
However, there may be seldom-used functionality which could possibly produce incorrect results.

8. Investigate possibility of accelerating BufferedImageOps:

The OGL/D3D pipelines accelerate some of the BufferedImageOps.
As far as I know XRender does not provide functionality which could be used to implement those, but I may be wrong.

9. Improve network bandwidth efficiency for text:

Currently glyphs are always identified using a 32-bit ID; however, XRender also provides functions for 8-bit or 16-bit IDs.
256 glyphs should be enough for most cases, and if item 5 is implemented, 16-bit IDs should always be enough.
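Picking the narrowest ID width for a request could look roughly like this. `GlyphIdWidth` and `bitsNeeded` are hypothetical names; on the C side libXrender has 8-, 16- and 32-bit variants of its text calls (e.g. `XRenderCompositeText8`, `XRenderCompositeText16`, `XRenderCompositeText32`). The sketch assumes glyph IDs are handed out as small non-negative integers starting from 0, which is not how the pipeline necessarily allocates them today.

```java
// Hypothetical helper, only for illustration; not part of the XRender pipeline.
final class GlyphIdWidth {

    // Returns 8, 16 or 32: the narrowest ID width that can hold every ID in the run.
    static int bitsNeeded(int[] glyphIds) {
        int max = 0;
        for (int id : glyphIds) {
            if (id < 0) {
                throw new IllegalArgumentException("glyph IDs must be non-negative");
            }
            max = Math.max(max, id);
        }
        if (max <= 0xFF) {
            return 8;
        }
        if (max <= 0xFFFF) {
            return 16;
        }
        return 32;
    }

    public static void main(String[] args) {
        int[] latinRun = {72, 101, 108, 108, 111};
        int[] largeCache = {12, 300, 4095};
        System.out.println("latin run: " + bitsNeeded(latinRun) + "-bit IDs");
        System.out.println("large cache: " + bitsNeeded(largeCache) + "-bit IDs");
    }
}
```

Compared with always sending 32-bit IDs, the ID part of a text request shrinks to a quarter with 8-bit IDs and to half with 16-bit IDs.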

10. Implement Tracing:

For now the XRender pipeline has only rudimentary tracing support, with many methods missing.
Extending it could help with tuning and would give developers useful hints about what's going on behind the scenes.

11. MaskFill/Blit Buffering:

For now the pipeline does MaskFill/MaskBlit with the usual 32x32 tiles; however, buffering those tiles in the mask used by MaskBuffer could improve the performance of those operations quite a bit for larger areas.
In general a lot of stuff could be buffered and later rendered in one go, as mentioned in "mid term goals".

12. Support Solaris:

The pipeline now builds on Solaris; however, running it in VirtualBox only gave empty windows. Maybe it works on native installations.
Support for Solaris will be validated / implemented soon.