Skincare is a multibillion-dollar industry, with anti-wrinkle products being one of its largest and fastest-growing segments. Unlike drugs, cosmetics do not need FDA approval before going to market, and manufacturers are not legally required to run rigorously regulated clinical trials to prove the efficacy of their cosmetic formulations.
As a matter of fact, most of the formulas or ingredients marketed as anti-wrinkle lack substantial evidence of efficacy. Yet the efficacy claims sound convincing, and they are often based on what some call “marketing studies”: studies designed in advance to confirm the efficacy of a marketed ingredient or formula. Such studies are very far from what is considered the gold standard in clinical research: the double-blind, randomized, placebo-controlled study.
Raw material producers often sponsor or arrange such studies to persuade consumers and manufacturers of the efficacy of the specific ingredients they supply. Manufacturers of cosmetic products do the same, either with the ingredients themselves (often the ones their R&D departments develop) or with finished cosmetic formulations that combine several ingredients.
PubMed is full of such “marketing studies” “proving” that a certain ingredient is effective in reducing wrinkles. However, such studies often lack the fundamentals that make research conclusive, and their design leaves a lot of room for bias.
But most regular consumers probably don’t even know about PubMed and are left face to face with a skincare industry that spends multimillion-dollar marketing budgets to persuade them of the efficacy of its products.
Imagine, as a consumer, that skincare manufacturers had to prove the efficacy of their products to the FDA before making any claims, the same way the drug industry does. Would you be happy? We think yes! It would simply mean that you wouldn’t have to spend money (and time) on things that don’t work, and that you would get the desired results from proven products. Many anti-aging products would probably have to leave the market or at least drop their anti-wrinkle claims. As that’s not the case today, we are trying to give consumers a tool to navigate this vague territory and understand which ingredients are proven to work. But that’s not the only question.
The obvious anti-wrinkle treatment, proven by many studies since the 1980s, recommended by dermatologists and, by the way, very cheap and effective, is prescription tretinoin (Retin-A, Renova). It has some side effects, such as skin irritation, but it’s still the gold standard in anti-wrinkle therapy.
So the second question is: how do other ingredients compare to tretinoin? They may well have fewer side effects, but what about their efficacy vs. tretinoin? Would you use a product that is less irritating but also several times less potent than tretinoin?
There is a discussion even among cosmetic chemists about how scientific cosmetic science is today and how scientific it should be.
Here are a few quotes from one of the leaders of the cosmetic chemistry industry, Dr. Johann W. Wiechers, from his book “Memories of a Cosmetically Disturbed Mind” (22).
"… Especially because cosmetic science is a commercial science. Why? Academic science is done for finding out how 'things' work, and to test or deny a hypothesis, whereas commercial science is science done for finding reasons for selling your product."
"A large proportion of us are doing science to sell products and not to explain things. Cosmetic science is therefore very often comparative science (how good is my product or my ingredient doing relative towards my competitor’s product or ingredient) instead of explanatory science (how does my product work).”
So we conclude here that cosmetic science, being an integral part of the cosmetic industry and focused on selling new molecules to manufacturers or products to end consumers, has by default a built-in bias toward demonstrating their efficacy.
In our approach we don’t require the same level of evidence as for drugs; otherwise very few cosmetic ingredients, if any, would qualify as having proven anti-wrinkle efficacy. However, our team came up with several common-sense criteria that allow us to reach the ultimate objective: to separate the wheat from the chaff and say, with a significant level of confidence, “yes, this ingredient actually works”.
So the key objective of this approach is to answer 2 questions:
Answers to both questions are mandatory for a particular ingredient to be included in the CreamScan anti-wrinkle rating. If a study reports that the ingredient has a certain anti-wrinkle efficacy but it’s impossible to measure it, or the precise results of wrinkle reduction are not reported, that study will not be taken into account.
When we say “prove”, we mean that the study design doesn’t leave a lot of room to question its results. It doesn’t have to be flawless, but it should close all the major “loopholes” used by “selling science”. To achieve these objectives, we created a list of eligibility criteria for evaluating the available research papers that support anti-wrinkle claims for topically applied actives. We describe them below.
The drug industry uses animals to test both efficacy and safety before trials on humans. The cosmetic industry does so as well, but it often does not go beyond the animal stage. We don’t discuss the ethical side of this here, purely the matter of evidence.
There are a lot of studies that describe a certain ingredient as “potentially effective to fight photoaging” based exclusively on animal tests. There are studies on hairless mice, guinea pigs, Yorkshire pigs, rats, even Drosophila flies. Such studies are significantly cheaper and easier to arrange, but their results cannot be taken as evidence of efficacy in humans.
Besides, the skincare industry can definitely afford studies on humans: they normally do not involve any serious side effects and are potentially beneficial for the volunteers taking part.
That sounds quite obvious, but some manufacturers talk about the potential effectiveness of certain skincare formulations by referring to research in which the substance was taken as a dietary supplement, or applied topically but combined with oral intake. Collagen is an example.
We include studies with topical application of the active ingredient and exclude the ones with oral or combined administration. By topical application we mean exclusively the use of an emulsion, with no instrumental interventions like microneedles or injections.
That simply means that we only take into consideration studies that were done on real people using the products. No ex vivo or in vitro studies, and no trials reporting only histological parameters, were taken into consideration.
We completely agree here with top cosmetic chemist Perry Romanowski, who said:
“The problem is that usually these lab studies do not translate to positive effects when used in real-life products. Just because an ingredient shows a benefit when applied directly to human skin cells in a petri dish does not mean it will have any effect when delivered directly to the skin from a lotion or other personal care product.”
Now that we are left with studies on humans, we want at least 20 people to take part in a study. That is still very low compared with the drug industry, where clinical trials are run on hundreds and even thousands of patients. But believe it or not, out of all the studies we reviewed, at least 30% employed fewer than 20 subjects.
Why 20? We had to pick a number to exclude studies with a ridiculously low number of participants. There are a lot of studies done on 15, 10 and even 7 volunteers. When one group uses a placebo (vehicle) and another the active formulation (verum), a minimum of 20 participants gives at least 10 subjects in each group. One subject’s result then has a 10% weight in the group’s total result, which is still a lot, but we definitely don’t want to go above that.
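The arithmetic behind the 20-subject threshold can be sketched in a few lines. This is only an illustration: it assumes participants are split evenly between a vehicle group and an active group, and the function name `per_subject_weight` is ours, not taken from any study.

```python
def per_subject_weight(total_subjects: int, groups: int = 2) -> float:
    """Return the share of a group's average contributed by one subject.

    Assumes subjects are split evenly across groups (an assumption made
    for this illustration); any remainder is ignored.
    """
    group_size = total_subjects // groups
    if group_size < 1:
        raise ValueError("each group needs at least one subject")
    return 1 / group_size


if __name__ == "__main__":
    # Study sizes mentioned in the text, plus the 20-subject threshold.
    for n in (7, 10, 15, 20):
        print(f"{n} subjects: one subject = {per_subject_weight(n):.0%} of a group")
```

With 20 subjects the weight of a single person is 10%; with 10 subjects it doubles to 20%, so one unusual responder can shift a group average noticeably.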
This is a big one (like an article within an article), but it is the most important one.
The active ingredient cannot be applied to the skin by itself at 100% concentration. At such a concentration it will either be very irritating, or be chemically unstable, or be unable to penetrate the skin by itself, or all of these together. So it needs a so-called vehicle to be delivered to the skin and absorbed.
The vehicle is normally a basic moisturizer. Let's imagine Nivea Crème, it could be a perfect vehicle to deliver an active ingredient into the skin.
So can we then take our “Nivea Crème”, add an active ingredient to it, run a study on 20 people and claim that this active ingredient is a new remedy against wrinkles? Well, we can, and actually about half of the studies claiming anti-wrinkle efficacy for the researched ingredients did it exactly this way.
So what's the problem with that?
The problem is that Nivea Crème, were such a study run, would be extremely likely to significantly reduce wrinkles by itself, because it is proven that a basic moisturizer can produce a substantial effect on wrinkles.
We found, and this is confirmed by others (2,18), that the anti-wrinkle effect of the basic moisturizer used as a vehicle (placebo) was comparable, or even superior, to that of the active ingredient. Among all the studies we reviewed, we included the ones that meet the following 3 conditions: 1) robust studies that meet all the criteria listed in this article, 2) vehicle-controlled (not positive-controlled) and 3) using an objective and quantitative method to evaluate efficacy, like profilometry or computer-based grading of wrinkle reduction. We ended up with 12 studies that qualify (3-14).
Then we calculated the average reduction in various wrinkle parameters, such as average skin roughness (Ra), average maximum depth (Rz, the average peak-to-valley amplitude, which reflects the depth of fine wrinkles), wrinkle volume, surface area, etc. These calculations were done for the vehicle alone and for vehicle + active.
The actives in those studies are: ascorbyl tetraisopalmitate (tetra-isopalmitoyl ascorbic acid), ascorbic acid, asiaticoside, copper tripeptide-1, ectoin, niacinamide, palmitoyl pentapeptide, retinaldehyde (retinal), retinyl propionate, tretinoin.
The results were to our surprise:
That means that without a vehicle control in the study, any ingredient on Earth, literally any, can claim anti-wrinkle efficacy just by being added to a moisturizer that can produce the same anti-wrinkle effect alone.
We will discuss this in more detail elsewhere, elaborating on the fact that a basic moisturizer is the most efficient part of an anti-wrinkle strategy.
The key hypothesis is that “use of skin moisturizers should delay or slow down the rate of persistent wrinkling, perhaps by plasticizing the stratum corneum (upper skin layer), thereby diminishing the formation of temporary wrinkles during facial expression and thus the potential for persistent wrinkles”. That was one of the conclusions of an 8-year (!) study of how expression lines turn into persistent wrinkles (18), and it is also confirmed by other studies.
Now let's go back to anti-wrinkle efficacy. How can we isolate the effect of the active from the effect of the vehicle-placebo-moisturizer? There is only one way to do it: compare the results of vehicle + active vs. vehicle alone.
So studies claiming that a cream with a magic ingredient improved wrinkles by x% vs. baseline are inconclusive if they don't compare the treatment formulation to the same formulation without the active: the vehicle (placebo). The wrinkle reduction can be caused not by the reviewed active ingredient but by other moisturizing ingredients present in the formula.
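The comparison of vehicle + active against vehicle alone can be sketched as follows. This is a minimal illustration with hypothetical numbers, not data from any study; the function names are ours, and the difference in percentage points is just one simple way to express the comparison, not necessarily the metric used in the rating.

```python
def reduction_vs_baseline(baseline: float, after: float) -> float:
    """Percent reduction of a wrinkle parameter relative to its baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline - after) / baseline


def effect_beyond_vehicle(active_arm: tuple[float, float],
                          vehicle_arm: tuple[float, float]) -> float:
    """Difference, in percentage points, between the two arms' reductions.

    Each arm is a (baseline, after) pair of group averages.
    """
    return reduction_vs_baseline(*active_arm) - reduction_vs_baseline(*vehicle_arm)


# Hypothetical group averages of one wrinkle parameter (e.g. roughness).
active = (100.0, 80.0)   # vehicle + active: 20% reduction vs. baseline
vehicle = (100.0, 85.0)  # vehicle alone: 15% reduction vs. baseline
print(effect_beyond_vehicle(active, vehicle))
```

In this made-up case the cream "reduces wrinkles by 20%", yet only 5 percentage points of that can be attributed to the active; a study without the vehicle arm would have no way to tell the two apart.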
A few words about positive control. We see it as a possible, although less reliable, alternative to a vehicle control. It is less reliable because in a double-blind study with a placebo/vehicle control, a researcher doesn't know which result he is looking at, vehicle or active, so he has no room for potential bias.
With a positive control, however, a researcher has room for bias even when blinded. To show that the reviewed ingredient has the same efficacy as the positive control, both formulas can simply be given the same efficacy evaluation when a subjective evaluation method is used (like expert grading of wrinkle severity). For that, a researcher doesn't need to know which result he is looking at: he may just give the same evaluation to all the results, and in the end it can be claimed that ingredient X has the same efficacy as tretinoin, for example.
Therefore we only take a positive control into consideration if no subjective method is used to evaluate the participants' results, only an objective and quantitative method like computer grading or profilometry.
The last part is about the comparability of a vehicle or positive control. As we said above, the whole point of a vehicle control is to exclude the anti-wrinkle effect of the other ingredients in a formula. The only way to achieve that is to compare identical formulations, one with the active and another without it or, in the case of a positive control, with another active.
We assume that this is the case even if the exact active and vehicle formulations are not reported. But in some studies it's obvious that the vehicle differs from the active formulation in more than the active ingredient content.
For example, in a key study of the anti-wrinkle efficacy of adenosine (20), it is said: “A placebo cream known from previous studies to have no effect on wrinkles was used as control.” Even if we ignore the fact that this previous study is not specified, why do we think that using a completely different vehicle formulation (even one with proven zero efficacy on wrinkles) is fundamentally wrong?
Because it's not about the efficacy of a vehicle, we agree here that it can be zero. It's about the efficacy of other ingredients in a formulation containing 0.1% of adenosine. How do we know that this formulation reduced wrinkles (no doubt about it) because of adenosine and not because of glycerin, mineral oil and other moisturizing ingredients it could have contained?
There is no way to know that unless we also see the results of exactly the same formula with glycerin, mineral oil but without adenosine.
It may be obvious that the active formulation and the vehicle should be used the same way: the same number of times per day, on the same part of the face (in the case of a split-face study, on the symmetrical part of the face).
But again, we see studies where that's not the case. An otherwise perfect study that met all our criteria and confirmed the anti-wrinkle efficacy of bakuchiol (21) as being on par with retinol failed to provide equal and comparable conditions for the active and its positive control. According to the study design, “the subjects were instructed either apply 0.5% retinol cream to their full face nightly or the 0.5% bakuchiol cream to their full face twice daily as a thin layer.”
So here we compare 2 identical formulas that differ only in their active ingredients, which is great. But why should one be used twice a day and the other just once? Did the bakuchiol-containing formula achieve the same results as the retinol one because of bakuchiol, or because it was applied twice as often and delivered better moisturization, which reduces wrinkles? No way to know…
In conclusion, we found that around half of the reviewed studies claiming anti-wrinkle efficacy are not vehicle-, placebo- or positive-controlled. We are not talking here about rare ingredients but about well-known ones, like trendy peptides or ascorbic acid derivatives. So that was a big exclusion factor.
A double-blind study is the second part of what is often referred to as the most reliable evidence or “gold standard” - randomized, double-blind, placebo-controlled trial.
A double-blind study means that neither the participants nor the experimenters know who is receiving a particular formulation. This is done on purpose to prevent bias in research results.
A lot of studies in skincare are organized or sponsored either by skincare manufacturers or raw materials suppliers for the industry. They are clearly interested in a positive outcome of the clinical trial.
Many existing trials still rely on subjective evaluation of clinical results by dermatologists or experts. Given the relatively small size of sample in skincare clinical trials, even one error could result in a significant shift in results.
We believe that double-blinding is an extremely important factor in delivering trustworthy results in skincare.
Randomization is the last part of the randomized, double-blind, placebo-controlled “gold standard”. It means that participants are randomly distributed into several comparable groups, normally one that uses the active formulation and another that uses the vehicle.
All participants are unique: they differ in wrinkle severity, genetics and age. Randomization aims to balance these factors and eliminate the selection bias that can influence the final result.
Natalia K. Spierings, reviewing the efficacy of vitamin A (2), challenges the details given in research papers about blinding and randomization: “In addition, only one trial (Kim et al.) described the method used to achieve randomization. In two of the trials described as double-blind studies, it was not specifically stated that the investigator was blinded. None of the trials described how, in fact, blinding was achieved”
We don't go that far: if a study states that it was a randomized, double-blind, placebo-controlled study, then for the sake of simplicity we assume that all parts of this formula were carried out correctly by the research team.
This simply means that we don't take into account studies that combine several actives in one formula, for example something like: “Jagdeo et al. Novel Vitamin C and E and Green Tea Polyphenols Combination Serum Improves Photoaged Facial Skin”
Even if it's a robust, well-designed study published in a respected journal, we simply cannot draw any actionable conclusions from it about the efficacy of a single active ingredient. Perhaps it is effective only because of vitamin C and the green tea adds nothing. There is no way to know from this type of study.
By the way, it's common practice to add a questionable or less researched ingredient to a proven one, like niacinamide or retinol, in order to claim overall efficacy and create the impression that there are 2 or more effective ingredients in the formula, not just one.
Here is an example: “Bouloc A, Vergnanini AL, Issa MC. A double-blind randomized study comparing the association of Retinol and LR2412 with tretinoin 0.025% in photoaged skin. J Cosmet Dermatol 2015; 14:40–6.”
LR2412 is a molecule developed by L'Oreal Research (INCI: Sodium Tetrahydrojasmonate) and marketed in various L'Oreal brands like Lancôme, Vichy and La Roche-Posay as an effective anti-aging ingredient. Retinol, like tretinoin, is a potent anti-wrinkle ingredient with well-proven efficacy, so no matter what you add to the formula, if retinol is already present in it, the formula is very likely to demonstrate anti-wrinkle efficacy.
If the research team aimed to understand the efficacy of LR2412, it should have used a formula where LR2412 is the only anti-wrinkle ingredient and compared it with exactly the same formula without LR2412. What the result would be, we don't know.
There are also quite a lot of studies out there that review so-called “novel” formulations: products that combine several active ingredients in one formula. This is often done to support the claims of a specific product marketed by a manufacturer.
There is nothing wrong with that, especially because manufacturers are naturally more interested in investing research money in their own formulations than in generic molecules. But again, this type of study is useless for understanding the efficacy of a particular active.
The same goes for a combination routine like serum with one active plus moisturizer with another active, or several step regimens with different active ingredients, when for instance one product is applied in the morning and second one in the evening.
The last point here is an obvious one: we need to know the concentration of the active in the formula. That is the case in almost all the studies we went through.
In short, we need to see clear numbers showing by what percentage the wrinkles were reduced by the active formulation vs. the vehicle (placebo) in order to conclude how effective a certain active ingredient is. That breaks down to a few criteria:
Results should be measured by a researcher, not by a participant. The method of evaluation can be objective (profilometry, computer image evaluation) or subjective, in which case it should be done by trained experts/dermatologists (graded evaluation of wrinkle severity), but we don't take into consideration studies where participants' self-assessment of wrinkle improvement is the only result reported.
As Johann W. Wiechers wrote in his book “Memories of a Cosmetically Disturbed Mind”: “Questionnaires can be leading, sometimes even misleading. Tell a person s/he is testing antiwrinkle products and any product will work (to some extent).” (22)
Results should be reported as a percentage of wrinkle reduction vs. baseline, or we should be able to calculate it from the data reported. Results are often reported as the change in a certain wrinkle parameter vs. baseline (before treatment) in absolute value, without specifying what the baseline value was. In this case we understand that there was a certain effect, and we may even conclude that this effect was superior to placebo/vehicle, but we can't tell how significant it was.
For example, if a study reports that average skin roughness improved by 10 points, it can be a significant 20% improvement if the baseline value before treatment was 50, or only a 5% improvement if the baseline value was 200.
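The same calculation in code, as a small sketch: it shows why an absolute change cannot be interpreted without the baseline. The function name `percent_improvement` is ours, and the inputs are the illustrative figures from the example above, not study data.

```python
def percent_improvement(absolute_change: float, baseline: float) -> float:
    """Express an absolute improvement as a percentage of the baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * absolute_change / baseline


# The same 10-point improvement against two different baselines.
for baseline in (50, 200):
    pct = percent_improvement(10, baseline)
    print(f"baseline {baseline}: {pct:.0f}% improvement")
```

When a paper reports only the 10-point change, the `baseline` argument is simply unknown, and the percentage cannot be computed at all.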
Both active and vehicle (placebo) numbers are reported. We need to understand how the active formula performs vs. placebo/vehicle (see why in Chapter 4). Often, in placebo-controlled studies, some results are reported only for the active formulation. The placebo/vehicle results are either not reported at all or reported without any numbers, with a statement like “It (active) was statistically significant while vehicle was not”.
It doesn't necessarily mean that the researched ingredient is ineffective. It probably showed some level of efficacy; we just don't know exactly what level based on the data provided, because the research team didn't make it crystal clear when reporting the results.
Another example is when data is reported only for the active treatment, in the format “improvement vs. baseline vs. placebo”. It may seem a valid format, but there is an important nuance, and we have seen it in actual studies.
Let's say an active formula reduced wrinkles by 5%. At the same time, subjects using the placebo/vehicle formulation saw their wrinkles worsen by 10%. That is quite unusual, as the vehicle/placebo normally has a positive effect on wrinkles, but we've seen this in a few studies, probably because of the low moisturizing properties of the vehicle.
So the researchers, motivated to demonstrate the efficacy of the active treatment, report that the active decreased wrinkles by 15% vs. placebo/vehicle, which is mathematically correct. But the wrinkles were reduced by only 5%, and that's the number we record as the efficacy result of this ingredient. That's why we need to see transparent reporting of both results, for the active and for the vehicle.
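The gap between the two ways of reporting can be made explicit with a short sketch. It uses the illustrative 5% and 10% figures from the example above; the function and key names are hypothetical and chosen only for this illustration.

```python
def summarize(active_reduction: float, vehicle_reduction: float) -> dict[str, float]:
    """Report each arm vs. baseline alongside the active-vs-vehicle difference.

    Both inputs are percent reductions vs. baseline; a negative value
    means the wrinkles got worse.
    """
    return {
        "active_vs_baseline": active_reduction,
        "vehicle_vs_baseline": vehicle_reduction,
        "active_vs_vehicle": active_reduction - vehicle_reduction,
    }


# Active reduced wrinkles by 5%; the vehicle arm worsened by 10%.
result = summarize(active_reduction=5.0, vehicle_reduction=-10.0)
for name, value in result.items():
    print(f"{name}: {value:+.0f}")
```

A paper that prints only the `active_vs_vehicle` figure (15) hides the fact that most of it comes from the vehicle arm getting worse, which is why all three numbers are needed.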
Last but not least, results should be peer reviewed. That means they should be published in journals with a peer review process, in which articles are reviewed by other experts (often called a Peer Review Board) before being published.
This criterion aims to ensure the article's quality, although we've seen quite a few articles that were peer reviewed and whose conclusions were still questionable.
Many ingredients claiming anti-wrinkle efficacy have no published clinical studies; their only evidence of efficacy is a brochure produced by the manufacturer of the raw material (this is the case for most peptides). Even if the brochure says there was a clinical trial, it was normally done on a very small sample and without placebo/vehicle control.
So there is no guarantee that a peer-reviewed study is a robust one; nevertheless, this criterion helps us avoid really low-quality papers.
In short, we can summarize our inclusion criteria as:
a peer-reviewed, double-blind, randomized, placebo-controlled study, done in vivo on a minimum of 20 subjects, that clearly reports the results of topical application of a formula with a known concentration of a single active ingredient.
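The summary above can be read as a checklist. The sketch below is one hypothetical way to encode it; the `Study` fields and the `failed_criteria` helper are our own naming, not an actual tool. It models only the one-sentence summary, so the positive-control exception discussed earlier is not covered.

```python
from dataclasses import dataclass

MIN_SUBJECTS = 20  # threshold taken from the summary above


@dataclass
class Study:
    """Hypothetical record of the properties checked for one study."""
    peer_reviewed: bool
    double_blind: bool
    randomized: bool
    vehicle_controlled: bool
    topical_only: bool
    single_active: bool
    concentration_known: bool
    in_vivo_human: bool
    results_clearly_reported: bool
    subjects: int


def failed_criteria(study: Study) -> list[str]:
    """Return the names of the criteria the study does not meet."""
    failures = [name for name, value in vars(study).items()
                if isinstance(value, bool) and not value]
    if study.subjects < MIN_SUBJECTS:
        failures.append("subjects")
    return failures


# A made-up study: no vehicle control and only 15 participants.
example = Study(
    peer_reviewed=True, double_blind=True, randomized=True,
    vehicle_controlled=False, topical_only=True, single_active=True,
    concentration_known=True, in_vivo_human=True,
    results_clearly_reported=True, subjects=15,
)
print(failed_criteria(example))
```

A study qualifies only when the returned list is empty; returning the failed items rather than a plain yes/no makes it easy to see why a paper was excluded.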
We reviewed more than a hundred ingredients marketed by the skincare industry and raw material manufacturers as having anti-wrinkle effects or potential. Only 12 ingredients have scientific papers that meet these criteria, and for them we can say yes: they have proven efficacy confirmed by robust research. You can learn what these 12 are here.