You could just use OpenGL 2.0 / GLSL 1.20 instead. All the cards still support it and I highly doubt they'll stop anytime soon.
====
Just checked Wikipedia. As someone who recently had to rewrite 22 shaders from GLSL to NVIDIA Cg, I wish I had originally done them the "new" way, because it would have made it a lot easier. GLSL seems a little bit odd in the way you specify certain inputs (and looking back I have no idea how half my shaders worked), whereas NVIDIA Cg (and HLSL, since the two are nearly the same language) lets you bind your variables to specific registers through semantics (float4 pos : POSITION, float4 texcoordin : TEXCOORD0, etc).
For passing the vertices in, vertex buffer objects are your best bet in any version of OpenGL. I haven't found a situation where a vertex buffer is slower than a display list, immediate mode, or plain client-side vertex arrays (glDrawArrays with no buffer bound).
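In case it helps, here is a rough sketch of the vertex buffer route in OpenGL 2.0 style. It assumes a GL context is already current and that GLEW is your extension loader (any loader that exposes the 1.5+ buffer functions would do). The function names and the triangle data are made up for illustration.

```c
#include <GL/glew.h>   /* assumption: GLEW is the loader, glewInit() already called */

/* Hypothetical data: one triangle, 3 floats (x, y, z) per vertex. */
static const GLfloat tri[] = {
    -1.0f, -1.0f, 0.0f,
     1.0f, -1.0f, 0.0f,
     0.0f,  1.0f, 0.0f
};

/* Upload the vertices once; the returned buffer name is reused every frame. */
GLuint make_triangle_vbo(void)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof tri, tri, GL_STATIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    return vbo;
}

/* Draw from the buffer. With a buffer bound, the last argument of
   glVertexPointer is a byte offset into the buffer, not a client pointer. */
void draw_triangle_vbo(GLuint vbo)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, (const GLvoid *)0);
    glDrawArrays(GL_TRIANGLES, 0, 3);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
```

Call glDeleteBuffers(1, &vbo) when the mesh is no longer needed.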
Is there some kind of "OK, start the shaders" function that I'm overlooking? The red and orange books seem awfully keen to show shader examples, but client-side code for actually using the damn things is strangely lacking.
this was a bit tough to find, but the process goes like so:
1. compile the fragment shader and vertex shader (you can actually compile as many source files as you want, for example a library you share between all your shaders, as long as you end up with exactly 1 entry point for fragments and 1 entry point for vertices)
see: glCreateShader, glShaderSource, glCompileShader
2. link the shaders into a program object
see: glCreateProgram, glAttachShader, glLinkProgram
3. when you want to use your shader, call glUseProgram. glUseProgram(0) goes back to default
4. get uniform locations with glGetUniformLocation
5. set uniforms with glUniform{1|2|3|4}{f|i}(v) (these act on the currently active program, so call glUseProgram first)
6. delete stuff when you're done with it (glDeleteShader, glDeleteProgram)
the opengl documentation should be enough for those functions.
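For what it's worth, the steps above could be strung together roughly like this. It's only a sketch: it assumes a GL 2.0 context is already current, that GLEW is the extension loader, and that the shader sources are already in memory as strings. The helper names (compile_stage, build_program, draw_with_shader) and the "tint" uniform are made up for the example.

```c
#include <stdio.h>
#include <GL/glew.h>   /* assumption: GLEW is the loader, glewInit() already called */

/* Step 1: compile one shader stage. Returns 0 on failure. */
static GLuint compile_stage(GLenum type, const char *src)
{
    GLuint shader = glCreateShader(type);
    GLint ok = GL_FALSE;

    glShaderSource(shader, 1, &src, NULL);   /* NULL: source is NUL-terminated */
    glCompileShader(shader);
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    if (!ok) {
        char log[1024];
        glGetShaderInfoLog(shader, sizeof log, NULL, log);
        fprintf(stderr, "shader compile failed:\n%s\n", log);
        glDeleteShader(shader);
        return 0;
    }
    return shader;
}

/* Step 2: link a vertex and a fragment shader into a program. Returns 0 on failure. */
static GLuint build_program(const char *vs_src, const char *fs_src)
{
    GLuint vs = compile_stage(GL_VERTEX_SHADER, vs_src);
    GLuint fs = compile_stage(GL_FRAGMENT_SHADER, fs_src);
    GLuint prog = 0;
    GLint ok = GL_FALSE;

    if (vs && fs) {
        prog = glCreateProgram();
        glAttachShader(prog, vs);
        glAttachShader(prog, fs);
        glLinkProgram(prog);
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
        if (!ok) {
            char log[1024];
            glGetProgramInfoLog(prog, sizeof log, NULL, log);
            fprintf(stderr, "program link failed:\n%s\n", log);
            glDeleteProgram(prog);
            prog = 0;
        }
    }
    /* Step 6 (partly): attached shaders are only flagged here and get freed
       with the program; glDeleteShader(0) is silently ignored. */
    glDeleteShader(vs);
    glDeleteShader(fs);
    return prog;
}

/* Steps 3-5: activate the program, look up a uniform, set it, draw. */
void draw_with_shader(GLuint prog)
{
    GLint tint;

    glUseProgram(prog);
    tint = glGetUniformLocation(prog, "tint");   /* hypothetical vec4 uniform */
    if (tint != -1)
        glUniform4f(tint, 1.0f, 0.5f, 0.0f, 1.0f);

    /* ... issue your draw calls here ... */

    glUseProgram(0);   /* back to the fixed-function default */
}
```

In a real app you'd look up the uniform locations once after linking instead of every frame, and call glDeleteProgram(prog) at shutdown.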