Contrary to popular belief, scaling has nothing to do with the difficulty of the subject. It is not actually possible to quantify "difficulty" objectively, so I honestly do not know why this myth persists.
Scaling is completely data-driven and is based on how many "smart" people are in the cohort, which is quantified by looking at how everyone performs in their other subjects. This may be correlated with a subject's "difficulty", so there is likely a classic confusion of correlation with causation (and difficulty is subjective anyway: for example, I find English far more difficult than Maths Ext 2, yet the latter scales better).
A simplified illustrative example: say there are only two subjects, A and B. Subject A is taken by 10,000 students, of whom 3,000 also take subject B (and no one else takes subject B). Suppose the average mark in subject A across all 10,000 students is 50%. However, if you look only at the subset of 3,000 students who also took subject B, their average in subject A is 60%. This suggests that, on average, the subject B cohort is stronger than the subject A cohort as a whole. Therefore, subject B should end up with a higher scaled average than subject A.
Notice that this is entirely driven by the data and has nothing to do with whether subject A is more or less "difficult" than subject B.
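If it helps, here is a minimal sketch of that pairwise comparison in Python. The numbers are the made-up ones from the example above, and the function name and structure are my own illustration, not the actual scaling algorithm:

```python
# A toy sketch of the pairwise comparison described above. The data and the
# function are illustrative assumptions, not the real UAC/NESA algorithm.

def cohort_comparison(marks_a, takes_b):
    """Compare the subject-A average of the whole cohort against the
    subject-A average of the subset who also take subject B.

    marks_a -- list of subject-A marks, one per student
    takes_b -- parallel list of booleans: does this student take subject B?
    Returns (overall_avg, subset_avg).
    """
    overall_avg = sum(marks_a) / len(marks_a)
    subset = [m for m, b in zip(marks_a, takes_b) if b]
    subset_avg = sum(subset) / len(subset)
    return overall_avg, subset_avg

# Toy data matching the example: 10,000 students in subject A averaging ~50%,
# where the 3,000 who also take subject B average 60% in subject A.
marks_a = [60.0] * 3000 + [45.7] * 7000
takes_b = [True] * 3000 + [False] * 7000

overall, subset = cohort_comparison(marks_a, takes_b)
print(f"All of subject A:     {overall:.1f}%")  # ~50.0%
print(f"A-marks of B-takers:  {subset:.1f}%")   # 60.0%
# subset > overall, so subject B's cohort looks stronger, and B's
# scaled average is set above A's.
```

Notice that nothing in the code knows, or needs to know, anything about the content of either subject.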
This is how scaling actually works for Extension subjects, where a direct comparison is possible; for non-extension subjects it gets very complicated, because you have to compare many combinations of other subjects at the same time.
So hypothetically, if the Maths Ext 1 cohort somehow scored a lower average in Maths Adv than the broader Maths Adv cohort as a whole, then Maths Ext 1 would actually get a lower scaled average than Maths Adv.
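Using the same toy function from above, that hypothetical would look like this (all numbers invented, with Maths Adv playing "subject A" and Maths Ext 1 playing "subject B"):

```python
# Hypothetical: the 2,000 Ext 1 students average *below* the overall
# Maths Adv mean, so the comparison reverses.
marks_adv = [48.0] * 2000 + [50.5] * 8000   # overall mean = 50.0%
takes_ext1 = [True] * 2000 + [False] * 8000

overall, subset = cohort_comparison(marks_adv, takes_ext1)
print(subset < overall)  # True -> Ext 1 would scale *below* Maths Adv here
```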