If you frequent audio sites you've probably come across listening tests and the criticisms of those listening tests. For some audiophiles only a double-blind A/B/X testing procedure with multiple tests and subjects is a valid test, and anything less is merely "a subjective opinion" with little or no validity. At the other extreme we have audiophiles who are able and willing to post extensive sonic analyses of components based on a few minutes at a public demonstration. Obviously there is a vast gap between these two testing methodologies.
I have issues with both of these positions. On the one hand, I feel that rigorous A/B/X testing, which is designed to determine difference/no difference, is too rigid and limiting for audio evaluation. At best it only tells you that there is or isn't a difference in a given test environment - it is not a universal test of sonic excellence. On the other hand, any comparison that does not include some way to make sure that the output levels of the two headphones, amplifiers, speakers, or whatever components are under test are identical has little value. Subjective listeners will always (unless the louder one is absolutely foul) prefer the louder source. I can't begin to count the number of high-end source components whose output levels were ever-so-slightly higher than their specification - their creators knew that would make them sound better in an unmatched level comparison.
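To put a number on "ever-so-slightly higher," here is a minimal sketch that converts an output-voltage difference into decibels. The 2.0 V specification and 2.1 V actual output are illustrative values I've made up for the example, not measurements of any real component, and the function name is my own.

```python
import math

def voltage_ratio_db(v_actual, v_reference):
    """Level difference in dB between two output voltages (20 * log10 of the ratio)."""
    return 20.0 * math.log10(v_actual / v_reference)

# Hypothetical source component: specified at 2.0 V, actually delivering 2.1 V.
diff = voltage_ratio_db(2.1, 2.0)
print(f"{diff:.2f} dB")  # prints "0.42 dB"
```

A 5% voltage bump works out to less than half a decibel, which is small enough that most listeners would not identify it as "louder" even though it can still tilt a preference.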
What, for me, constitutes a valid listening test? First, levels between the components under test must be matched. With some components, such as headphones, this can be a challenge. Using an SPL meter or SPL meter app isn't that hard - just don't expect your numbers to correlate with a standardized test rig. If you are consistent, though, you can get measurements that are accurate enough for level-matching purposes. Obviously a volume control that has calibrations or numerical values will make level matching much easier, and some preamps that lack any form of level calibration may prove impossible to use for A/B tests because there is no way to repeatably match levels.
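As a rough illustration of the bookkeeping involved, here is a minimal sketch that compares SPL meter readings taken for two components playing the same test signal. The readings, the function names, and the 0.2 dB tolerance are all assumptions chosen for the example, and averaging the dB values directly is a simplification that is only reasonable when the readings are close together.

```python
from statistics import mean

def level_offset_db(readings_a, readings_b):
    """How much louder A is than B, in dB (negative means A is quieter)."""
    return mean(readings_a) - mean(readings_b)

def is_matched(readings_a, readings_b, tolerance_db=0.2):
    """True when the average levels differ by no more than the assumed tolerance."""
    return abs(level_offset_db(readings_a, readings_b)) <= tolerance_db

# Hypothetical readings (dB SPL), same test tone, same meter position each time.
headphone_a = [75.2, 75.0, 75.1]
headphone_b = [74.3, 74.5, 74.4]

offset = level_offset_db(headphone_a, headphone_b)
print(f"A is {offset:+.1f} dB relative to B")
print("Matched" if is_matched(headphone_a, headphone_b) else "Adjust the volume and re-measure")
```

The absolute numbers matter less than taking every reading the same way; the offset tells you which direction to nudge the volume before measuring again.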
Once levels are reliably matched, the next problem with listening tests is the variability of the tester's source and source material. With a massively bad source or software you can make it so you couldn't tell the difference in an A/B test between a $50 and a $5000 headphone! Start with an iPhone, add a 128 kbps MP3, and you've lowered the resolution level to the point that any test will be rendered useless. At the other price extreme, sometimes the filter setting on the DAC (fast, slow, or minimum phase) or using one brand of cable over another can have a major effect on the overall sound of a system. If the listener is not familiar with the source and source components, they can easily attribute a sonic characteristic to the component under test when it rightfully belongs to a different component in the signal chain.
If you're thinking that my conclusion will be "Testing and subjective evaluations should be left to the trained professionals," you are wrong. I think that every audiophile who is serious about getting the best out of their system needs to conduct their own listening tests. I do not think that A/B/X is the way to go, but your test does need rigorous attention paid to level matching. And if you are not familiar with the source and source material, it will be very difficult, if not impossible, to arrive at any meaningful sonic information that can be applied in a universal manner. Still, if you do a listening test in your own system, you will find out what sounds best in that system. That's something.
Please do listening tests, but use some level of control - make sure your levels are matched and that you are familiar with the other components in your signal chain. And don't expect that a five- to ten-minute listening test at a hi-fi show or head-fi meeting will give anyone, even those with golden ears, enough data to form an informed opinion on the ultimate sonic quality of anything!