See important post on my work website here
Well, I’ve finally got round to programming a model that:
- Asks you just five best-worst scaling questions – you choose your “most agreed with principle” and “least agreed with principle” – people take 2-3 mins to answer this tops.
- Runs a best-worst scaling (BWS) exercise on just YOUR five answers.
- Spits out three things:
  - A pie chart showing how likely each of the six main options (continued EU membership / Norway option / Switzerland option / Canadian option / Turkish option / World Trade Organisation option) is to best satisfy YOUR principles
  - A pie chart showing the predicted chances of you personally supporting each of the five principles
  - A pie chart showing the predicted chances of you personally rejecting each of the five principles
Thus, the first chart tells you, based on which of these five principles we could “get” under each of the six models (one REMAIN, five BREXIT), what the chances are of getting “as much as we want” from each model of a new British-European relationship.
This, like all CORRECT best-worst scaling, is an individual model, giving you PERSONALISED results, not “you averaged with others”.
We can, of course, average across people, slice and dice the results across sex/gender/political affiliation etc, to find out what model is most popular in certain groups. But the point is, my model doesn’t NEED to do that. All because just five BWS questions tell me everything I need to know about what you value.
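To give a flavour of how five best-worst answers from ONE person can be turned into pie-chart shares, here is a minimal sketch. It is not my actual model: it uses simple best-minus-worst counts, and the step that turns scores into shares (an exponential weighting) is an assumption made purely for illustration. The principle names "A" to "E", the example answers and the function names are all hypothetical.

```python
from collections import Counter
from math import exp

# Hypothetical labels standing in for the five principles.
PRINCIPLES = ["A", "B", "C", "D", "E"]

# Hypothetical answers from a single respondent: one "most agreed with"
# and one "least agreed with" principle per question.
answers = [
    {"best": "A", "worst": "C"},
    {"best": "A", "worst": "B"},
    {"best": "D", "worst": "C"},
    {"best": "A", "worst": "C"},
    {"best": "E", "worst": "B"},
]


def best_minus_worst(answers, items):
    """Count how often each item was picked best minus how often worst."""
    best = Counter(a["best"] for a in answers)
    worst = Counter(a["worst"] for a in answers)
    return {item: best[item] - worst[item] for item in items}


def to_shares(scores):
    """Turn scores into shares summing to 1 (illustrative assumption:
    exponential weights, i.e. a softmax over the scores)."""
    weights = {item: exp(score) for item, score in scores.items()}
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}


scores = best_minus_worst(answers, PRINCIPLES)
# Shares for the "support" chart use the scores as they are.
support = to_shares(scores)
# Shares for the "reject" chart flip the sign of every score.
reject = to_shares({item: -s for item, s in scores.items()})

for item in PRINCIPLES:
    print(f"{item}: score={scores[item]:+d} "
          f"support={support[item]:.2f} reject={reject[item]:.2f}")
```

Everything here uses one person's answers only, which is the point: no averaging with anyone else is needed to get a personalised set of shares.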
Gold dust for all the campaigns – and the government, as it struggles to negotiate what type of new relationship would command majority support in the country.
I have deliberately answered the survey as a “hypothetical REMAINer” to show what they should have done – namely made the single European market something people understood and fought for, above other factors.
There are lots of scenarios – including what probably actually happened, in that people were in reality “sure” they disliked free movement of people and/or EU budget contributions but unsure about their SEM/FTA/CU support – which lead to a BREXIT outcome as the most likely to achieve their preferences. Your relative preferences for these determine which BREXIT model (hard/soft) is most likely to suit you.
Campaign managers/constituency parties/national party executives as well as Jo(e) Public would be very interested in this.
Wow. In this article Will Hutton interviews Amartya Sen. A crucial quote:
“…you have to take in, somehow, the unattractiveness of the last as well as the attractiveness of the first candidate.”
Wow, quantifying the worst as well as the best?
Which group has been at the forefront world-wide of doing this?
Yep, we’ve been way ahead of our time.
OK I’m breaking my self-imposed law within a few hours.
I usually have utmost respect for Ben Goldacre and don’t want to get into trolling territory on twitter but this is a simplistic statement. The first statement is true. The second is highly debatable if you stratify by age.
It is well known (see Bill Mitchell amongst a wealth of others, many of whom could not be seen as “outsiders” but are well within the mainstream) that unemployment in southern EU countries is appalling amongst the young. 50% or so. People with PhDs living at home with parents and, if they’re lucky, doing some barista work. All courtesy of the banking rules that force them to “live within their means – like a household”. All a nonsense paradigm of course if you understand how money is created and destroyed. But the results are in and have been in for many years now. There is, of course, a strong affinity with the EU, given the benefits of the past. However, recent ECB policy means the young can’t afford a home, and get bare-bones healthcare.
New year = digital spring cleaning time! Ugh. No matter how future-proof you aim to be with how you structure files, aim to work seamlessly across PCs etc it never takes long for reality to change and you realise you need to do the rigmarole again.
When admin is done I’m back to the project looking at comparing Case 2 BWS estimates with DCE ones. I shall look with “fresh eyes” since I haven’t worked on it since before xmas. (Plus we need to get this rounded off so we can submit and get paid, hehe.)
Then it’s the (long-delayed) big marketing push for TF Choices LTD. I’ve had a good number of proposals and funded projects come my way so far but can’t rest on my laurels…time to make sure a load of marketers and others know what I can do for them, in addition to the academic community I was part of!
I can’t think of anything methodological I want to shout about today (phew, they think)….I’ll continue to post anything big or of key relevance but as there are only so many hours in the day and company stuff must come front and centre in 2017 it’s likely that my comments and posts will be related to things I’m doing at the time (like Case 2 vs DCEs) rather than detailed posts triggered by twitter or citation alerts I get.
Sometimes I feel like a stuck record.
Six years ago I gave a lecture in Sydney explaining that:
- Happiness is an idiotic public policy goal.
- It isn’t the same as well-being (quality of life) – which is a desirable goal.
- The two bear no relationship in old age. Something that the UK and USA have now both already shown. Happiness looks “good” in old age but it’s a cohort/longitudinal effect: either the current cohort of older people are using very different yardsticks “we got through the war” or older people naturally think along the lines of “we got to age 80, we must be doing OK” (despite a wealth of evidence to suggest huge numbers are unhealthy, poor, lonely and generally in need of greater resources).
Well, the Sydney Morning Herald has picked up on one of these silly life satisfaction/happiness surveys which has (finally) cast doubt over the “everything is hunky dory in old age” conclusion….which was what they were proclaiming, along with the US and UK media, until recently, when questions over such “happiness/life satisfaction” methods began to be raised on the back of (now not particularly “new”) academic research. Better late than never I suppose.
But do you know why this is so frustrating?
The methods to properly value well-being/quality of life quantitatively, without using these “life satisfaction numbers” were developed – ON THE BACK OF AUSTRALIAN RESEARCH – in 2008 and applied in the UK and Australia very quickly by me and my colleagues.
Why is asking a person for a numerical score bad? Well, the bottom line is that these scores were debunked by 2001 in the top academic marketing journals in the world. Did you know that “4” in Mandarin also has connotations with “death” and so is considered an unlucky (and unused) number by a lot of Chinese people? Or that two other numbers we could use in these scales have a Cantonese connotation with something that an internet family filter will kill? Suffice to say “flaccid” and “hard” are the relevant descriptors.
That’s just for starters.
We have methods to properly describe well-being using a framework devised by one winner of the Economics “Nobel” prize and a method to properly value it using a framework of another.
And the ironic thing? All this work came together based on work done in Sydney 10 years ago.
But the Sydney Morning Herald, although running one or two little pieces on quiet news days, never had the courage to showcase that Australian research. It wasn’t until the Brits and the Yanks showed it that it now appears as a major article. Tall poppy Syndrome anyone?
For almost 6 years I worked just down the road from the SMH and the ABC, yet was only asked to do “filler” stories. I feel sad rather than annoyed. By the time I left Australia I’d realised what the score was. It’s a shame. Sexy news, rather than good research wins out – and that’s NOT specific to Australia. But when the result is whole generations of people being under-resourced, one has to stop for a moment and think.
This post replies to some requests I have had asking me to respond to a paper concluding that DCEs are better than BWS for health state valuation. To be honest I am loath to respond, for reasons that will become apparent.
First of all, let me clarify one thing that people might not appreciate – I most definitely do not want to “evangelise” for BWS and it is not the solution in quite a few circumstances. (See the papers coming out from the CHU-9D child health valuation study I was involved with for starters – BWS was effectively a waste of resources in the end….”best” choices were all we could use for the tariff.)
I only really pushed BWS strongly in my early days as a postdoc when I wanted to make a name for myself. If you read my papers since 2007 (*all* of them) you’ll see the numerous caveats appear with increasing frequency. And that’s before we even get to the BWS book, where we devote an entire chapter discussing unresolved issues including the REAL weaknesses and research areas for BWS (as opposed to straw men I have been seeing in recent literature).
OK now that’s out of the way, I will lay some other cards on the table, many of which are well-known since I’ve not exactly been quiet about them. I had mental health issues associated with my exit from academia. I’m back on my feet now doing private sector work for very appreciative clients, but that doesn’t mean I want to go back and fight old battles….battles which I erroneously thought us three book authors had “won” by passing muster with the top mathematical psychologists, economists and others in the world during peer review. When you publish a paper in the Journal of Mathematical Psychology (the JHE of that field) illustrating a key feature/potential weakness of a DCE (or specifically Case 2 BWS) back in 2008 you tend to expect that papers published in 2016 would not ignore this and would not do research that showed zero awareness of this issue and as a result made fundamental errors – after all, whilst we know clinical trials take a while to go from proposal to main publication, preference studies do NOT take 8+ years to go through this process. I co-ran a BWS study from conceptualisation to results presentation in 6 days when in Sydney. Go figure.
So that’s an example of my biggest frustration – the standards of literature review have often been appalling. Two or three of my papers (ironically including the JHE one, which includes a whopping error which I myself have repeatedly flagged up and which I corrected in my 2008 BMC paper) seem to get inserted as “the obligatory BWS reference to satisfy referees/editors” and in many cases bear no relation to the point being made by authors. Alarm bells immediately flash when I read an abstract via a citation alert and see those were my references. But it keeps happening. Not good practice, folks.
In fact (and at a recent meeting someone with no connection to me said the same thing) in certain areas of patient outcomes research the industry reviews are considered far better than academic ones – they have to be or get laughed out of court.
Anyway, I have been told that good practice eventually drives out bad. Sorry, if that’s true, the timescale was simply too long for me, which didn’t help my career in academia and raised my blood pressure.
Returning to the issue at hand. I’m not going to go through the paper in question, nor the several others that have appeared in the last couple of years purporting to show limitations of BWS. I have a company to run, caring obligations and I’ve written more than enough for anyone to join the dots here if they do a proper literature review. My final attempt to help out was an SSRN paper. But that’s it – without some give and take from the wider community, my most imaginative BWS work will be for clients who put food on the table and who pay – sometimes quite handsomely – for a method that when properly applied shows amazing predictive ability together with insights into how humans make decisions.
Now, of course, health state valuation is another kettle of fish – no revealed preference data etc. However, Tony, Jordan and I discussed why “context” is key in 2008 (JMP); I expounded on this with reference to QALYs in my two 2010 single authored papers, and published an (underpowered) comparison in the 2013 JoCM paper (which I first presented at the 2011 ICMC conference in Leeds, getting constructive criticism from the top choice modellers on Earth). So this issue is not particularly new.
It’s rather poor that nobody has actually used the right design to compare Case 2 BWS with DCEs for health state valuation…I ended up deciding “if you want something done properly you have to do it yourself” and I am very grateful to the EuroQoL Foundation for funding such a study, which I am currently analysing with collaborators. I don’t really “have a dog in this fight” and if Case 2 proves useful then great, and if not then at least I will know exactly why not…and the reasons will have nothing to do with the “BWS is bad m’kayyyyy” papers published recently. (To be fair, I am sometimes limited in what I can access: I no longer have an academic affiliation, so full texts are sometimes unavailable. But when there’s NO mention of attribute importance in the abstract, NOR of why efficient designs for Case 2 are problematic, my Bayesian estimate is 99.99% probability the paper is fundamentally flawed and couldn’t possibly rule BWS in or out as a viable competitor to a DCE.)
If you’d like to know more:
- Read the book
- Read all the articles – my google scholar profile is up to date
- Get up to speed on the issues in discrete choice design theory – fast. Efficient designs are in many many instances extremely good (and I’ve used them) but you need to know exactly why in a Case 2 context they are inappropriate.
If you still don’t understand, get your institution to contract me to run an exec education course. When I’m not working, I’m not earning, full stop.
I’m now far more pragmatic about the pros and cons of academia and really didn’t want to be the archetypal “I’m leaving social media now” whinger. And I’m not leaving. But I am re-prioritising things. Sorry if this sounds harsh/unhelpful – I didn’t want to write this post and hoped to quietly slip beneath the radar, popping up when I thought something insightful based on one of BWS’s REAL disadvantages or Sen’s work etc was mentioned. But people I respect have asked for guidance. So I am giving what I can, given the 10 minutes of free time I have.
Just trying to end on a positive note – I gave a great exec education course recently. It was a pleasure to engage with people who asked questions that were pertinent to the limitations of BWS and who just wanted to use the right tool for the right job. That’s what I try to do and what we should all aim for. I take my hat off to them all.
I recently had a mole removed by a GP with a special interest (GPSI) in dermatology. It was an interesting experience, given that the first ever discrete choice experiment I conducted elicited patient preferences for exactly this type of doctor and specialty.
The study was piggy-backed onto an early (the first?) trial of GPSI care. That trial established equivalence of care with the traditional consultant-led secondary care model (for the large proportion of cases that are routine enough for GPSI care to be appropriate). The DCE, however, showed resistance to GPSI-type care among patients, on average. Now, this was unsurprising: we knew no better and quoted average preferences, which mean nothing usually in DCEs (since you are averaging apples and oranges). Subgroup analyses I did established which patient subgroups were open to GPSI-type care (and when), and those results were all very predictable.
It is the wording we were strongly encouraged to use for the attributes (such as the doctor description etc) that is the subject of this post, particularly in the light of my personal experience of such care “at the sharp end”. We did not use the actual job titles of the doctors: had we done so, we would have given the respondents the choices between “seeing a member of a consultant-led team, which may or may not be the consultant him/herself” versus “seeing a GP who has had (considerable?) special additional training in dermatology”, making it clear that (1) many people don’t see the consultant, contrary to what they believe, and (2) a GPSI is perfectly qualified to deal with their condition and if anything non-routine is found, they are instantly moved to the consultant-led team’s care.
Now, I know why the triallists didn’t like this: patients see “GP” and instantly form (often incorrect) opinions. That was brought home to me when I saw a doctor at the local hospital in Nottingham (actually a private treatment centre subcontracted by the NHS): he never revealed he was a GPSI until we started “talking shop” and suddenly his ID badge was held up in front of me with the exclamation “I was one of the first GPSIs in dermatology appointed!” My referral letter said I would see (consultant) Dr X or a member of his team. Hmmmm. Thankfully I had no preconceptions, and received top notch care – I would certainly see him again if I needed to. (Of course I looked up this GPSI subsequently and it turns out he specialised in surgery first before moving to General Practice to improve conditions for family life, so he was particularly well qualified.) But it did illustrate, albeit anecdotally, that what was really required was a DCE with “labels” (the actual doctor type) to capture the true patient preferences: that would focus minds on the need for a public education campaign to reduce the stigma associated with GPSIs. What we did, although not misleading in terms of describing the doctors, brushed the underlying problem under the carpet. (So we should have run a labelled DCE – we knew no better then but I am using my own experience to illustrate a serious problem here that continues unabated in health. That’s for another day, however.)
The other attribute I would, with the benefit of being an actual patient, change was location of care. The DCE heavily implied that non-hospital care would be a local general practice. Now, of course, if your general practice doesn’t have the facilities to do minor surgery then this may be grossly misleading. Indeed I had to travel further than the local hospital to get to the GPSI’s surgery for my mole removal. As it happens it didn’t matter: distance as the crow flies was not the important factor in my ability to get there. However, it immediately made me slightly annoyed at the guidance I as the DCE lead received when I did the study. The wording we used was, again, “technically correct” in that the choice was between a place of care that was convenient and local versus not, but I’m fairly sure a non-trivial number of our respondents could have made incorrect assumptions about these attribute levels. I know I did, and I ran the DCE!
It made me a bit (more) cynical about the motives of certain parts of academia: I’d already seen via twitter a much heralded result of a trial I know about that, shall we say, could have been improved upon immensely. Furthermore, I had pause for thought recently when I learnt that some members of industry consider academia-led literature reviews and so-called systematic reviews in certain areas of health to be not worth the paper they’re written on. (I can concur on that regarding recent reviews in my own field). In a time that has seen a huge amount of industry-bashing for selective release of information/publication it really does act as a reminder that some areas of academia need to take a good hard look at their own conduct. Plus, just to be fair, I do shout out about the amazing groups I have worked with or continue to work with. I just feel Ben Goldacre and Danny Dorling were bang on the money in their beliefs (informed by different evidence, which was particularly damning) that bad practice by academia and its associated institutions contributes to the general lack of confidence by the public in the “elites” and how “having your own facts”, whilst of course ludicrous, is a perfectly understandable public reaction to elites that no longer seem to uniformly put the public good first.
As usual I shall make the caveat that there are great groups I work with and this isn’t just “academia bashing”. I just offer constructive criticism based on my own experiences (and mistakes) and give examples of the kind of lack of transparency that cleverer people like Ben and Danny have highlighted as barriers to getting academia more support among the general populace.
You need to go over to my company blog if you want a posting today.
I put it there as it’s more of a private-public sector issues one….though I wasn’t the one laying into sections of academia and related fields….twas people at the IJE conference 🙂
This is not exactly a moan (since in some cases I’m requesting fewer references to one or two of my own papers, which is all very nice!). It’s just a reminder that BWS has been an evolving technique over many years and I continue to note too many people just seem to add the JHE 2007 paper as “the BWS reference” when it really isn’t supporting what they are doing or saying.
I’ve not been afraid to admit when I’ve done something incorrect/misleading, or when the field has moved on and an earlier paper is becoming outdated. (So when I call others on bad referencing, rest assured that I do the same for myself.)
Some points to note:
- The JHE article was the first comprehensive explanatory Profile Case (Case 2) BWS paper. However, the “marginal models” there involved coding that, although it gives correct point estimates, gives misleading summary statistics like log-likelihoods, by not taking account of the sequential nature of the data: a first choice from 5 options means only 4 options are available for the second choice.
- This was corrected ASAP – the 2008 BMC paper on the dermatology study corrected this, so marginal sequential models should really reference this paper.
- References to “dual/multi stage choice tasks” (primarily to get QALYs) should start with my 2010 Pharmacoeconomics paper, since that was the first to propose these (including the DCE+TTO rescaling method). Too many researchers reference later papers.
- I was also first in explaining why the “death state” can’t be valued in a DCE without duration and a higher resolution design – in 2008 I wrote about this in Pop Health Metrics, with the God of math psych, Tony Marley, amongst others. I also pointed out why variance scale factors can be highly problematic in DCEs/other choice models. I certainly wasn’t first on the latter point – you should be looking to papers in the 1990s by Swait & Louviere, and Hensher and Louviere for that.
- First reference to a Case 1 BWS study is in The Patient: Patient-Centered Outcomes Research (2010) by Louviere and Flynn (to my knowledge – I am happy to be corrected if wrong).
- If you’re comparing Case 2 BWS with DCEs you really should be understanding and discussing how they differ, which was introduced in detail in the 2013 JoCM paper by Flynn et al., with subsequent discussion in the book (2015). DO NOT conclude that either method is “wrong”/“right” purely on the basis of a comparison of results from each task. Our work explains why they might differ.
- For Case 3 BWS I’m not the key person: Emily Lancsar was/is big in introducing and applying this in health. Please also note the correct name for this is the “multi-profile case” as agreed by Louviere, Marley and me in preparation for the book. Like the profile case, renaming was done so as to better describe what made Cases 2 and 3 distinct from other Cases.
- First reference to a peer-reviewed published Case 2 study was from the 1990s by Szeinbach et al; first UK study was 2006 by our team in BJD.
- Finally, the emerging problems with highly efficient designs: Rose and Bliemer hypothesised this back in 2009; I and team published the first within-subject confirmation in Pharmacoecon 2016.
Thus, it’s just a guide to help practitioners get the correct reference for BWS and associated conceptual issues. Hope it helps. I may add to this if I think of other issues that are incorrectly attributed.
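For anyone unsure why the sequential point in the first bullet matters for log-likelihoods, here is a minimal sketch. It is not the code from any of the papers: it assumes a simple multinomial logit for the best choice, takes the "worst" utility to be the negative of the "best" utility, and all function names and utility values are hypothetical. The only difference between the two functions is whether the second (worst) choice is scored against all 5 options or against the 4 that remain once the best has been picked.

```python
import math


def log_choice_prob(utilities, chosen, available):
    """Log of the logit probability of picking `chosen` from `available`."""
    denom = sum(math.exp(utilities[j]) for j in available)
    return utilities[chosen] - math.log(denom)


def sequential_bw_loglik(utilities, best, worst):
    """Best chosen from the full set, then worst from what is left."""
    items = list(utilities)
    neg = {j: -u for j, u in utilities.items()}  # assumed worst utilities
    remaining = [j for j in items if j != best]
    return (log_choice_prob(utilities, best, items)
            + log_choice_prob(neg, worst, remaining))


def naive_bw_loglik(utilities, best, worst):
    """Ignores the sequence: worst is scored against the full set."""
    items = list(utilities)
    neg = {j: -u for j, u in utilities.items()}
    return (log_choice_prob(utilities, best, items)
            + log_choice_prob(neg, worst, items))


# Hypothetical utilities for five options in one choice set.
u = {"a": 1.0, "b": 0.5, "c": 0.0, "d": -0.5, "e": -1.0}

seq = sequential_bw_loglik(u, best="a", worst="e")
naive = naive_bw_loglik(u, best="a", worst="e")
print(f"sequential: {seq:.4f}  naive: {naive:.4f}")
```

With five equally attractive options the sequential version gives log(1/20) for a best-worst pair while the naive version gives log(1/25), so the summary statistics differ even before any question of the estimates themselves.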