Category Archives: Surveys

BREXIT-REMAIN redux

[Figure: EU support graph]

Well, I’ve finally got round to programming a model that:

  • Asks you just five best-worst scaling questions – in each you choose your “most agreed with” principle and your “least agreed with” principle – people take 2-3 minutes at most to answer.
  • Runs a best-worst scaling (BWS) exercise on just YOUR five answers (a minimal sketch of this step follows the list).
  • Spits out three things:
    • A pie chart showing how likely each of the six main options (continued EU membership, the Norway option, the Switzerland option, the Canadian option, the Turkish option, the World Trade Organisation option) is to best satisfy YOUR principles
    • A pie chart showing the predicted chances of you personally supporting each of the five principles
    • A pie chart showing the predicted chances of you personally rejecting each of the five principles
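
To make the scoring step concrete, here is a minimal sketch in Python. The principle labels, the best-minus-worst count scorer and the softmax (multinomial-logit style) conversion to probabilities are all my assumptions for illustration – this is not the actual model behind the survey.

```python
import math

# Hypothetical principle labels (assumptions, not the survey's exact wording).
PRINCIPLES = ["single market", "free movement", "EU budget contribution",
              "independent trade deals", "no ECJ jurisdiction"]

def bws_scores(answers):
    """Score each principle as (# times chosen best) - (# times chosen worst).

    `answers` is a list of (best, worst) pairs, one per BWS question.
    """
    scores = {p: 0 for p in PRINCIPLES}
    for best, worst in answers:
        scores[best] += 1
        scores[worst] -= 1
    return scores

def choice_probabilities(scores):
    """Turn scores into probabilities via a softmax (multinomial-logit style)."""
    exps = {p: math.exp(s) for p, s in scores.items()}
    total = sum(exps.values())
    return {p: v / total for p, v in exps.items()}

# One respondent's five best/worst answers (made-up example).
answers = [("single market", "EU budget contribution"),
           ("single market", "no ECJ jurisdiction"),
           ("free movement", "EU budget contribution"),
           ("single market", "independent trade deals"),
           ("free movement", "no ECJ jurisdiction")]

for principle, prob in sorted(choice_probabilities(bws_scores(answers)).items(),
                              key=lambda kv: -kv[1]):
    print(f"{principle}: {prob:.2f}")
```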


Thus, the first chart tells you, based on which of the five principles we could “get” under each of the six models (one REMAIN, five BREXIT), the chances of getting “as much as we want” from each model of a new British-European relationship.

This, like all CORRECT best-worst scaling, is an individual-level model, giving you PERSONALISED results, not “you averaged with others”.

We can, of course, average across people, and slice and dice the results by sex/gender/political affiliation etc. to find out which model is most popular in particular groups (a sketch of that aggregation step follows). But the point is, my model doesn’t NEED to do that, because just five BWS questions tell me everything I need to know about what you value.
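
Because the results are already personalised, that averaging and slicing is just a summary step. A minimal sketch, assuming a hypothetical table of per-person probabilities (all numbers made up):

```python
import pandas as pd

# Hypothetical individual-level output: one row per respondent, with the
# personalised probability that each model best satisfies their principles.
df = pd.DataFrame({
    "affiliation": ["Lab", "Con", "Lab", "LD", "Con"],
    "p_remain":    [0.55, 0.20, 0.48, 0.70, 0.15],
    "p_norway":    [0.25, 0.30, 0.32, 0.20, 0.25],
    "p_wto":       [0.20, 0.50, 0.20, 0.10, 0.60],
})

# Slicing and dicing is then a one-liner, e.g. by political affiliation:
print(df.groupby("affiliation")[["p_remain", "p_norway", "p_wto"]].mean())
```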

Gold dust for all the campaigns – and the government, as it struggles to negotiate what type of new relationship would command majority support in the country.

I have deliberately answered the survey as a “hypothetical REMAINer” to show what they should have done – namely, make the single European market something people understood and fought for, above other factors.

There are lots of scenarios – including what probably actually happened, in which people were in reality “sure” they disliked free movement of people and/or EU budget contributions but unsure about their support for the single European market (SEM), a free trade agreement (FTA) or a customs union (CU) – which lead to a BREXIT outcome as the most likely to achieve their preferences. Your relative preferences for these determine which BREXIT model (hard/soft) is most likely to suit you.

Campaign managers/constituency parties/national party executives as well as Jo(e) Public would be very interested in this.


states worse than dead

No, this isn’t another moan by yours truly about how valuation people deal (in)correctly with states worse than death in health economics valuation exercises (phew).

This tweet interested me. There are all sorts of things you could do with a discrete choice experiment (DCE) to measure the trade-offs such patients make. When at UTS, we did a DCE that did two things, one novel and one not so novel.

The first was an attitudinal exercise which found there are three segments among Australian retired people (our sample was around 1,100 in total) when you get them to tell you which statements about life they relate to most and least – Best-Worst Scaling. We did something never done before – fed back to them their own results after the survey, which they could print off, bring to their doctor to discuss, use as the starting point for an end-of-life care plan, etc.: results of this form a chapter in the book referenced. Of course, the doctors at the sharp end in ICUs had warned us that, thanks to TV programmes, the general public has much higher expectations about the success/acceptability of these dramatic interventions than is true in practice, but you could do the same survey with patients. In fact, the bare bones of the survey are still live at the link and you can see how you compare with older Aussies.

The second DCE was (by DCE standards) very, very simple, but was done to get a handle on the trade-offs people would make regarding the kinds of interventions in the survey in this Twitter post; unfortunately it won’t give you personalised results.

These types of DCEs should become routine. They can be done on touchscreen tablet PCs etc. while the patient is waiting to see the doctor, and they can give personalised results – not aggregated ones as in the bad old days. People like them, and like to know how they compare with others – the older generation love those surveys comparing them to others just as much as the younger “Facebook generations” do. C’mon people, this survey is great and very, very informative, but we can move forward even further – and do it today.

Where next for discrete choice health valuation – part one

My final academic obligations concerned two projects involving valuation of quality of life/health states. Interestingly, they involved people at opposite ends of the age spectrum – children and people at the end of life. (Incidentally, I am glad that the projects happened to be ending just as I exited academia, so I didn’t leave the project teams in the lurch!)

These projects have thrown up lots of interesting issues, as did my “first generation” of valuation studies (the ICECAP-O and –A quality of life/well-being instruments). This blog entry will be the first of three to summarise these issues and lay out some ideas for how to go forward with future valuation studies and, in particular, construction of new descriptive systems for health or quality of life. In time they will be edited and combined to form a working paper to be hosted on SSRN. The issue to be addressed in this first blog concerns the descriptive system – its size and how it can/cannot be valued.

The size of the descriptive system became pertinent when we valued the CHU-9D instrument for child health. More specifically, an issue arose concerning the ability of some children to do Best-Worst Scaling tasks for the CHU-9D. The project found that we could only use the “best” (first-best, actually) data for reporting. This is not secret: I, and other members of the project team, are reporting this at various conferences over the coming year. I may well be first, at the International Academy of Health Preference Research conference in St Louis, USA, in a few weeks. We knew from a pilot study that children exhibited much larger rates of inconsistency in their “worst” choices than their “best”: the plot of best versus worst frequencies had a bloody big part of the inverse relationship curve missing! (This was the first time I had seen that.)

[Figure: Best versus worst choice frequencies of the type seen in the child health study]

When you plot the best choice frequency against the worst choice frequency of each attribute level you should see an approximately inverse relationship. After all, an attractive attribute level should be chosen frequently as best and infrequently as worst; an unattractive attribute level should be chosen frequently as worst and infrequently as best. Yet in the child health study, the unattractive attribute levels (the low levels of the 9 attributes), whilst showing the expected small “best” frequencies, did not show large “worst” frequencies: they were all clustered together around a low worst frequency. This showed that the kids seemed to choose pretty randomly when it came to the “worst” choices – particularly bad attribute levels were NOT chosen more often as worst than moderately bad attribute levels. This left part of the “inverse relationship” curve missing – the first time I had seen that. It led us to make a big effort to get a lot of worst data (two rounds) and to make the task easy (by greying out previously chosen options to structure it). Unfortunately, it didn’t really work.
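
A short sketch of the diagnostic plot described above, with made-up attribute levels and frequencies shaped like the pattern we saw (illustrative only, not the study data):

```python
import matplotlib.pyplot as plt

# Hypothetical attribute levels and choice frequencies. Note the bad levels:
# rarely chosen best, but their worst frequencies cluster instead of rising,
# so part of the expected inverse-relationship curve is missing.
levels     = ["good A", "middling A", "bad A", "good B", "bad B"]
best_freq  = [0.30, 0.18, 0.02, 0.28, 0.03]
worst_freq = [0.02, 0.10, 0.12, 0.03, 0.13]

plt.scatter(best_freq, worst_freq)
for lvl, b, w in zip(levels, best_freq, worst_freq):
    plt.annotate(lvl, (b, w))
plt.xlabel("best choice frequency")
plt.ylabel("worst choice frequency")
plt.title("Best vs worst frequencies (illustrative)")
plt.show()
```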

I stress that despite my deliberately controversial title for the IAHPR conference, we CANNOT know if it was (1) the valuation method (BWS), (2) the descriptive system (CHU-9D) or (3) just plain respondent lack of knowledge that caused kids to be unable to decide what was worst about aspects of poor health.

(1) could be true if kids IN GENERAL don’t think about the bad things in life; (2) could be true if the number of attributes and levels was too large – the CHU-9D has 9 attributes, each with 5 levels, which is the largest instrument I have ever valued in a single exercise (I was involved in the ASCOT exercise which split the instrument in two); (3) could be true if kids can do “worst” tasks, but in general they just can’t comprehend poor health states (since kids from the general population are mostly highly unlikely to have experienced or even thought about them).

In the main study I hoped that “structured BWS” eliciting four of the nine ranks in a Case 2 BWS study would help the kids. More specifically:

(1) They answered best

(2) Their best option was then “greyed out” and they answered worst

(3) This was in turn greyed out and they answered next (second) best

(4) This was in turn greyed out and they answered next (second) worst.

This in theory gave us four of the nine ranks (1,2,8,9). It was particularly useful because it enabled us to test the (often blindly made) assumption that the rank ordered logit model gives you utility function estimates that are “the same” no matter what ranking depth (top/bottom/etc) you use data from. Unfortunately our data failed this test quite spectacularly – only the first best data really gave sensible answers. So the pilot results were correct – for some reason in this study, kids’ worst choices were duff. (Even their second best data were not very good.)
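
For concreteness, here is a small sketch of the bookkeeping that turns the four structured answers into ranks 1, 2, 8 and 9 of the nine options shown. The option names are placeholders; the actual rank-ordered logit estimation (and the test across ranking depths) would be done in a dedicated statistics package, so only the data-construction step is shown:

```python
def partial_ranks(options, choices):
    """Map structured BWS answers to ranks 1, 2, n-1 and n out of n options.

    `choices` = (best, worst, second_best, second_worst), elicited in that
    order, with previously chosen options greyed out at each step.
    """
    n = len(options)
    best, worst, second_best, second_worst = choices
    # The middle ranks (3 .. n-2) remain unknown under this design.
    return {best: 1, second_best: 2, second_worst: n - 1, worst: n}

# Nine hypothetical attribute levels, as in a Case 2 BWS task on the CHU-9D.
options = [f"level {i}" for i in range(1, 10)]
print(partial_ranks(options, ("level 1", "level 9", "level 4", "level 7")))
# {'level 1': 1, 'level 4': 2, 'level 7': 8, 'level 9': 9}
```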

Of course, as I mentioned, we don’t know the reason why this was the case, so we must proceed with caution before making controversial statements about how well BWS works among kids (ahem, cough, cough…)

But given the mooted idea to devise an “ICECAP for kids”, we should bear in mind the CHU-9D findings when constructing the descriptive system. I certainly don’t want to criticise the very comprehensive and well-conducted qualitative work done by Sheffield researchers to construct the CHU-9D. I merely pose some questions for future research to develop an “ICECAP for kids instrument” which may cause a tension between the needs of the descriptive system and the needs of the valuation exercise.

Would an ICECAP for kids really need 5^9 = 1,953,125 profiles (quality of life states) to describe child quality of life (as the CHU-9D did for health)?

My personal view is that too much of the health economics establishment may be thinking in terms of psychometrics, which (taking the SF-36 as the exemplar) typically concentrates on the number of items (questions/dimensions/attributes). A random utility theory based approach concentrates on the number of PROFILES (health/quality of life states). This causes the researcher to focus more on the combination of attributes and levels. When the system is multiplicative (as in a DCE), the number of “outcomes” becomes large VERY quickly.
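
The arithmetic is worth making explicit – a quick check of the profile counts mentioned above and below:

```python
# Profiles multiply: (levels per attribute) ** (number of attributes).
print(5 ** 9)  # CHU-9D:   9 attributes, 5 levels -> 1,953,125 states
print(4 ** 5)  # ICECAP-O: 5 attributes, 4 levels -> 1,024 states
```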

Thus, some people are missing the point when they express concern at the small number of questions (five) in the ICECAP-O and -A. In fact there are 4^5 = 1,024 possible outcomes (states) – and moreover, over 200 of the 1,024 possible ICECAP-O states are observed in a typical British city. That makes the instrument potentially EXTREMELY sensitive. So I would end with a plea to think about the number of profiles (states), not the number of attributes. Can attributes be combined? That, and the statistics/intuition behind it, will be the subject of the second blog in this series.


Copyright Terry N Flynn 2015.

This, together with the accompanying blogs, will form a working paper to be submitted to SSRN. Please cite appropriately if referring to these issues in academic papers.


open letter to sen and mcfadden

Dear Professors McFadden and Sen,

I am writing to alert you to a paper in the AER last year that seriously fails to acknowledge:

(1)   your main paradigm-shifting work in discrete choice modelling, for which you won the Nobel prize (McFadden), and

(2)   work that used McFadden’s work to operationalise the Capabilities Approach, for which you won the Nobel prize (Sen).

I wish to make it clear that one author of the paper in question has already publicly acknowledged (2) above, and given that McFadden’s work is (arguably) “more mainstream” in economics, the failure to acknowledge (1) makes the paper’s errors even more serious – at least, in the eyes of economists.

The paper in question is Beyond Happiness and Satisfaction: Toward Well-Being Indices Based on Stated Preference. Daniel J. Benjamin, Miles S. Kimball, Ori Heffetz, and Nichole Szembrot. Am Econ Rev. 2014 Sep; 104(9): 2698–2735.

I mentioned above that Miles Kimball (who, by reputation, is probably the senior author) has already acknowledged the lack of referencing of the work to operationalise Professor Sen’s Capabilities Approach.

(see attached screenshot of Miles’s recognition of his poor referencing on Twitter)


Yet, even if I were to accept “the Sennian literature being often outside the mainstream economics literature” as a mitigating factor (which I do not), proposing a discrete choice model to elicit public values for elements of well-being is most definitely not “outside mainstream economics”. Of course your work on discrete choice models, Professor McFadden, is entirely “mainstream”.

A friend/colleague of mine working in the “more mainstream” branches of economics considers my shock and annoyance entirely justified – she cannot believe the authors were unaware of the work of the team in which I worked from 2001-2015 to use stated preference discrete choice models to value elements of the Capabilities Approach.

In terms of my reputation, and that of the wider group: I worked with Professor Jordan Louviere for many years, and a textbook we co-authored with Professor Tony Marley (acknowledged in Professor McFadden’s Nobel lecture) is being published by CUP this month. (Please see my Google Scholar profile for details of my publications.)

I do recognise and respect Professor Sen’s issues with putting numerical values on elements of Capabilities, but the “ICEPOP” team in which I worked has, I would posit, gone the furthest in providing a framework in which the importance of these can be valued, leaving the societal decision rule to be decided separately, by you Professor Sen, or others – please see numerous publications by Professor Joanna Coast and me.

Anyway, our group has been the worldwide leader in marrying your respective approaches in order to value well-being, and I consider it insulting, to say the least, that a paper in the AER should appear, claiming that crown, when our papers have been available in reputable (and in some cases decidedly mainstream) peer-reviewed economics journals for a decade.

I make no request of you for further explicit action – I left academia several months ago, partly due to this kind of behaviour. I merely would ask that if further work by these authors or others, “passes by you or your colleagues”, you make them aware that valuing Capabilities using discrete choice models is well-progressed already and that they should be checking the publications of me, and the wider ICEPOP team.

Kind regards

Terry

www.terryflynn.net

Terry N Flynn MA(Cantab), MSc, PhD
Director, TF Choices LTD
http://www.tfchoices.com
contact@tfchoices.com

Tel: +44 (0)115 888 0809 (UK); +61 (0)2 8006 0907 (AUS)
Twitter: @tfchoices @tflynnhealth


Copyright Terry N Flynn September 2015

tfchoices ltd website live

My company website is live! My sister (once again) has done sterling work for me – we are going to change the “revolving cycle of pictures” to better convey the kinds of human decision-making problems I will be dealing with. However, it has the main elements in place.

It has its own dedicated blog, which will be much more focussed on client needs etc. I already have my first post in mind, which will go up in the next few days.

Go check it out!


bank account almost there

Had a two-hour meeting with a business manager at a bank this morning.

Despite my absence from the UK for six years, nothing suspect/lacking was triggered in the credit check (phew). This particular bank does a lot of intensive work at the first meeting, but the result is that I should have my business bank account live on Thursday.

Most other banks would say “at least two weeks”. My Dad also banks with this particular bank, and others have recommended it for company accounts, so I feel pretty confident this will work out.

I will get http://www.tfchoices.com up and running ASAP, with associated contact details. But my g-mail address (see contact tab) will suffice in the meantime and is monitored.

Things are going to move quickly now as my first contract has a deadline for being signed and sealed! @tfchoices is my twitter feed for the company – it will be a lot more formal than the existing one so follow if you think you might be interested in commissioning me or collaborating on joint bids.

Moody teenagers? Giving them a greater say in health policy might solve this

Cross posted from The Ethics Blog

We have all heard of moody teenagers. Maybe we have them, or can remember being one. Recent research with my Australian colleagues suggests they may genuinely have more difficulty living with poor mental health than adults do.

Specifically, compared to the general public aged 18+, they are more likely to view mental health related impairments as being worse than physical disabilities.

This is not just an academic curiosity – if true, it means society is probably under-investing in child mental health. To explain why, we must first understand how most European countries decide on health funding priorities.

In general, disabilities with the greatest capacity to benefit from treatment are prioritised. To find out whether pain, depression, or some other (physical) impairment to health is worst – and therefore has the greatest potential benefit from treatment – nations conduct large population-based surveys. These require adults to make choices between lots of possible impaired health states in order to find out just how bad those states are, relative to each other.

Of course, people often disagree on what is worst, and by how much, so decisions must be made as to whose values matter most. European nations generally agree that it is unethical to allow the rich to dictate what disabilities are most deserving of resources. Instead of “one € one vote”, it is “one person one vote”: taking a simple average of every adult’s values does this naturally.
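
A toy illustration of why the simple average implements “one person one vote”, whereas an income-weighted average would not (all names and numbers invented):

```python
# Made-up values three adults assign to one impaired health state.
values  = {"Ann": -0.20, "Bob": 0.10, "Cho": 0.40}
incomes = {"Ann": 20_000, "Bob": 30_000, "Cho": 900_000}

# "One person one vote": the unweighted mean counts everyone equally.
one_person_one_vote = sum(values.values()) / len(values)

# "One euro one vote" (the rejected alternative): weight values by income.
one_euro_one_vote = (sum(values[p] * incomes[p] for p in values)
                     / sum(incomes.values()))

print(one_person_one_vote)  # 0.10 - each person counts equally
print(one_euro_one_vote)    # ~0.38 - the rich respondent dominates
```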

Whilst this sounds fair and democratic in terms of process, it could be leading to uncomfortable outcomes for our moody teenager. Why? Well, if poor mental health is genuinely worse for teenagers than adults believe it to be then mental health interventions might not get funded: for example, if adults think pain is much worse, pain medications will be prioritised instead. This is because only adults are being asked for their health values, not teenagers.

So perhaps adults just don’t remember what it’s like to be young and we should use the teenagers’ values for health interventions that affect them?

Maybe not. There is a saying “age brings wisdom” and perhaps adults’ greater experience of illness means their values for mental health impairments are the correct ones. Maybe younger people have simply not experienced enough in life to know what aspects of illness are really worst. After all, immaturity is one reason why younger teenagers are not allowed to vote.

The ethical issues surrounding at what age teenagers can have sex, vote and make independent decisions in public life all become relevant here. However, “one person one vote” has one more disturbing implication that is relevant for people of all ages. By taking an average of everyone’s views, national health state value surveys include lots of healthy people who have no idea what it is like to live with severe illness. Does this matter? Well, it turns out that to the depressed patient in desperate need of a new anti-depressant it probably does.

Patients and the general public tend to disagree on which is worst – extreme pain or extreme depression. The general public gets the final say and my next blog entry will discuss how and why we might use the health values of patients themselves in priority setting instead.

Happiness isn’t quality of life if you’re old

The subject of happiness, particularly among older people, has come up (again) in the media. I reckon they trot out the latest survey results whenever there’s a slow news day. I think it’s no coincidence the newest stories have appeared in the slow month of August.

Anyway I shall keep this short as I’ll rant otherwise. Once again, neither happiness nor life satisfaction is the same as quality of life and we can argue til the cows come home as to which of the three (if any) is truly well-being.

First of all, if I can find the time to write up a follow-up to the paper I published on the mid-2000s survey of Bristolians, I will show this:

[Figure: Five year age bands showing mean levels (after rescaling) of self-rated happiness versus scored quality of life in Bristol]

The two track reasonably closely until retirement age. Then, whilst happiness continues to rise, quality of life certainly does not. The wealth of other evidence on health, money, friends, etc. from the survey suggests our QoL measure, the ICECAP-O instrument, is the better measure of overall well-being.

We are not the only ones to find this. A large US study pretty much concluded they didn’t know WTF older people were doing when they answered life satisfaction/happiness questions, but they sure don’t answer them the same way that younger adults do. Older people use a different part of the numerical scale (typically a higher portion, all other things being equal). That’s rating scale bias, and there is a huge and growing literature on it.

Stop asking these dumb questions. There are good alternatives.