I'm using a Gaussian blur fragment shader.  To keep things concise, I gave it two subroutines: one that selects the horizontal texture coordinate offsets, and another that selects the vertical ones.  This way, I have just one Gaussian blur shader to manage.
Here is the code for my shader.  The {{NAME}} bits are template placeholders that I substitute in at shader compile time:
#version 420
subroutine vec2 sample_coord_type(int i);
subroutine uniform sample_coord_type sample_coord;
in vec2 texcoord;
out vec3 color;
uniform sampler2D tex;
uniform int texture_size;
const float offsets[{{NUM_SAMPLES}}] = float[]({{SAMPLE_OFFSETS}});
const float weights[{{NUM_SAMPLES}}] = float[]({{SAMPLE_WEIGHTS}});
subroutine(sample_coord_type) vec2 vertical_coord(int i) {
    return vec2(0.0, offsets[i] / texture_size);
}
subroutine(sample_coord_type) vec2 horizontal_coord(int i) {
    //return vec2(offsets[i] / texture_size, 0.0);
    return vec2(0.0, 0.0); // just for testing if this subroutine gets used
}
void main(void) {    
    color = vec3(0.0);
    for (int i=0; i<{{NUM_SAMPLES}}; i++) {
        color += texture(tex, texcoord + sample_coord(i)).rgb * weights[i];
        color += texture(tex, texcoord - sample_coord(i)).rgb * weights[i];
    }
}
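For context, the {{NAME}} substitution mentioned above is a plain text replacement done before the source is handed to the compiler. A minimal sketch of that kind of substitution follows; the helper name fill_placeholders and its behaviour (unknown placeholders are left untouched) are illustrative assumptions, not the code actually used here:

```cpp
#include <map>
#include <string>

// Hypothetical helper: replaces every "{{KEY}}" in `source` with the value
// stored under KEY in `values`. Placeholders without a matching key are
// copied through unchanged.
std::string fill_placeholders(const std::string& source,
                              const std::map<std::string, std::string>& values) {
    std::string result;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = source.find("{{", pos);
        if (open == std::string::npos) break;
        const std::size_t close = source.find("}}", open + 2);
        if (close == std::string::npos) break;

        // Copy the text that precedes the placeholder.
        result.append(source, pos, open - pos);

        const std::string key = source.substr(open + 2, close - open - 2);
        const auto it = values.find(key);
        if (it != values.end()) {
            result += it->second;                            // known key: substitute
        } else {
            result.append(source, open, close + 2 - open);   // unknown key: keep as-is
        }
        pos = close + 2;
    }
    // Copy whatever follows the last placeholder.
    result.append(source, pos, std::string::npos);
    return result;
}
```

Called with NUM_SAMPLES and SAMPLE_OFFSETS entries, this turns the offsets declaration into an ordinary constant array before compilation.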
Here is my code for selecting the subroutine:
blur_program->start();
blur_program->set_subroutine("sample_coord", "vertical_coord", GL_FRAGMENT_SHADER);
blur_program->set_int("texture_size", width);
blur_program->set_texture("tex", *deferred_output);
blur_program->draw(); // draws a quad for the fragment shader to run on
and:
void ShaderProgram::set_subroutine(constr name, constr routine, GLenum target) {
    GLuint routine_index = glGetSubroutineIndex(id, target, routine.c_str());
    GLuint uniform_index = glGetSubroutineUniformLocation(id, target, name.c_str());
    glUniformSubroutinesuiv(target, 1, &routine_index);
    // debugging
    int num_subs;
    glGetActiveSubroutineUniformiv(id, target, uniform_index, GL_NUM_COMPATIBLE_SUBROUTINES, &num_subs);
    std::cout << uniform_index << " " << routine_index << " " << num_subs << "\n";
}
I've checked for errors, and there are none.  When I pass in vertical_coord as the routine to use, my scene is blurred vertically, as it should be.  The routine_index variable is 1, which is odd because vertical_coord is the first subroutine listed in the shader code, but maybe the compiler is reordering things.
However, when I pass in horizontal_coord, my scene is STILL blurred vertically, even though routine_index is 0, which suggests that a different subroutine has been selected.  Yet the horizontal_coord subroutine explicitly does not blur.
What's more, whichever subroutine comes first in the shader is the one the shader uses permanently.  Right now, vertical_coord comes first, so the shader always blurs vertically.  If I put horizontal_coord first, the scene is unblurred, as expected, but then I cannot select the vertical_coord subroutine! :)
Also, the value of num_subs is 2, suggesting that there are 2 subroutines compatible with my sample_coord subroutine uniform.
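One thing worth double-checking in set_subroutine: as far as I understand the OpenGL spec, glUniformSubroutinesuiv expects an array with one entry per active subroutine uniform location for the stage (the count reported by GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS), where entry i holds the subroutine index for the uniform at location i. The sketch below builds such an array. build_subroutine_indices and the SubroutineIndex alias are hypothetical names, and the GL calls appear only in comments so the snippet compiles without a context:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using SubroutineIndex = unsigned int;  // stands in for GLuint (assumption)

// Hypothetical helper: returns an array sized to the number of active
// subroutine uniform locations, with `routine_index` stored at the slot
// given by `uniform_location`. All other slots default to index 0.
std::vector<SubroutineIndex> build_subroutine_indices(std::size_t num_locations,
                                                      std::size_t uniform_location,
                                                      SubroutineIndex routine_index) {
    assert(uniform_location < num_locations);
    std::vector<SubroutineIndex> indices(num_locations, 0);
    indices[uniform_location] = routine_index;
    return indices;
}

// Intended use inside set_subroutine (not compiled here):
//   GLint n = 0;
//   glGetProgramStageiv(id, target, GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS, &n);
//   auto indices = build_subroutine_indices(n, uniform_index, routine_index);
//   glUniformSubroutinesuiv(target, n, indices.data());
```

With a single subroutine uniform at location 0, this reduces to the one-element array passed in set_subroutine above. As far as I can tell from the spec, subroutine uniform state is not stored in the program object and is reset whenever glUseProgram is called, so the selection has to happen after the last glUseProgram that precedes the draw.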
Just to reiterate, all of my return values are fine, and glGetError() reports no errors.
Any ideas?