Superiority of COSMIC Function Points over Story Points for estimating Agile Projects

This topic contains 14 replies, has 6 voices, and was last updated by  Mark Thias 1 year, 2 months ago.

  • #5707

    Charles Symons
    Participant

    The paper ‘Effort Estimation with Story Points and COSMIC Function Points – An Industry Case Study’ reports on the performance of estimation models built using Story Points and COSMIC Function Points and shows that, for the organization in which the data was collected, the use of COSMIC Function Points leads to estimation models for effort with much smaller variances. The study also demonstrates that the use of COSMIC Function Points allows objective comparison of productivity across tasks within a Scrum environment.

  • #5711

    Thomas Fehlmann
    Participant

    And let’s add that with COSMIC you also get testing metrics, and even safety and security measures, included for free with the count – something impossible with Story Points, because Story Points are not based on an ISO/IEC 14143-compliant model for software.

  • #5713

    Mark Thias
    Participant

    I read the position paper and note that there are some rather serious misunderstandings of Story Points. I will give you some detailed feedback along with some Mike Cohn reference material.

    • #5754

      Charles Symons
      Participant

      Hi Mark, I suddenly remembered a very interesting dialogue on this forum about 16 months ago on the subject of using COSMIC on Agile projects. The principal contributor was Denis Krizanovic of Aon Australia. You will find a lot there about how he uses COSMIC in practice in the context of Agile projects.
      One day soon, we will update our Guideline on using COSMIC in Agile projects using his experience.

      I hope it is of interest and helpful.

  • #5715

    Thomas Fehlmann
    Participant

    I agree with Mark as Story Points do not relate to size of software but are direct team estimations of cost. COSMIC FP as well as IFPUG FP and other FP measurement methods relate to functional size, while cost estimation results from the questionable belief that functional size still is the major cost driver as it was in the past. In today’s world of legal, security and safety concerns related to SW development, I think functional size is no longer everywhere the dominant cost driver; however, it still might be for the said case study. Thus, the title of the paper suggests a wrong comparison between two methods addressing different issues, while the paper is worth reading.

  • #5717

    Alain Abran
    Participant

    Thomas,
    When you write that Story Points are ‘direct team estimations of cost’ do you mean ‘estimation of effort’? This is indeed what I have observed teams are estimating with Story Points.

  • #5720

    Ben Ootjers
    Participant

    It seems that there is a fundamental misinterpretation of the Story Point metric in the mentioned paper. The right approach to the Story Point metric (as also indicated by the cited Cohn book) is to use relative measurement and not to estimate effort directly in hours or days. The actual duration or hours are derived by using velocity; they are not estimated directly by the team, as was done in the paper. Story Point estimates are kept consistent by using reference stories. Even if the team’s productivity changes, the number of Story Points is not affected. In the paper a traditional hour estimate is treated as equal to Story Points; that is really bad practice.
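As a rough illustration of the velocity-based approach described in this post, here is a minimal Python sketch. It is not taken from the paper or from Cohn’s book; the function names and all the figures are hypothetical. The team estimates stories only in relative points, and the duration is derived afterwards from the observed velocity.

```python
import math


def average_velocity(completed_points_per_sprint):
    """Mean number of story points completed per sprint in past sprints."""
    return sum(completed_points_per_sprint) / len(completed_points_per_sprint)


def sprints_needed(backlog_points, velocity):
    """Sprints required to burn down the backlog, rounded up to whole sprints."""
    return math.ceil(backlog_points / velocity)


# Hypothetical figures, for illustration only.
past_sprints = [21, 18, 24]           # points completed in the last three sprints
backlog = [1, 2, 3, 5, 8, 13, 5, 3]   # relative estimates from planning poker

velocity = average_velocity(past_sprints)
remaining = sum(backlog)
print(f"velocity: {velocity} points/sprint")
print(f"remaining: {remaining} points")
print(f"sprints needed: {sprints_needed(remaining, velocity)}")
```

Note that no hour or day figure is attached to any individual story; time only enters through the velocity measured after the fact.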

  • #5722

    Charles Symons
    Participant

    Ben,
    There is an intensive discussion proceeding off-line with Mark Thias, who commented earlier on this same point. We’re hoping to come back with a joint statement soon. This organization may not be adopting best Agile practice in the way it uses (or interprets) Story Points. The important question is whether this invalidates the study’s conclusion that in this organization COSMIC sizes correlate better with effort than what this organization calls SP.

    At the moment, I still conclude the findings are valid and significant.

  • #5724

    Ben Ootjers
    Participant

    Charles, I think the results for the COSMIC estimation are probably valid (I did not look into that in detail), but the Story Point estimation is not valid, so the comparison and the findings are also not valid.

  • #5726

    Charles Symons
    Participant

    Ben,
    You may claim, and we may accept, that this organization’s use of SP for estimation is not the proper way to use SP and that therefore this study is not representative of how accurate SP-based estimation is in organizations generally. (Btw it would be nice to see hard evidence that SP-based estimating is better than reported in this organization.) But that does not make the findings of the study in this organization ‘not valid’. This paper reports the facts found in a particular organization. That’s all.

    All the findings are ‘valid’, unless you’re suggesting the researchers made mistakes. Equally, it is not unreasonable for us to suggest that the results might be of wider relevance.

  • #5745

    Alain Abran
    Participant

    Within the Conclusion section of our Position Paper we made it very clear that our findings were for a specific organization. Below are the 5 instances (underlined here) within the Conclusion section where we clearly pointed out (on purpose) the scope limitations of our analysis:

    1. This study has examined the performance of the Planning Poker / Story Points as an estimation technique within an organization where software is being delivered using the Scrum methodology.
    2. When compared to actual effort for 24 tasks completed at the industry site in the case study, estimates with Story Points in this organization were shown to have led to large under- and over-estimates, with an MMRE of 58% – see Figure 2.
    3. This study has allowed us to demonstrate that in this organization the development team implemented the tasks at a sustained pace within a range of 2 to 3 hrs/CFP for 20 of the 24 tasks, iteration after iteration, throughout the period for which the data was collected – Figure 4.
    4. The COSMIC based estimation model built with the initial 24 tasks in this organization had a much smaller estimation variance (i.e., an MMRE of 28%) – Figure 3.
    5. The analysis of productivity extremes within this data set allowed identifying a functional reuse context within this organization that had led to major productivity gains for 2 tasks with such high functional reuse.
    6. This paper illustrated as well how this industry site used this estimation model equation and expected MMRE variance to estimate an additional task …
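For readers unfamiliar with the MMRE figures quoted in the list above: MMRE is the Mean Magnitude of Relative Error, i.e. the mean of |actual - estimate| / actual over all tasks. The Python sketch below shows how such a figure can be computed for a simple linear model of effort against COSMIC size. The least-squares fit is an assumption made for illustration (the exact model form used in the paper is not stated in this thread), and the data are made up; they are not the 24 tasks from the case study.

```python
def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error: mean of |actual - estimate| / actual."""
    if not actuals or len(actuals) != len(estimates):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)


def fit_line(sizes, efforts):
    """Ordinary least-squares fit of effort = intercept + slope * size."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(efforts) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, efforts))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data, NOT the tasks from the case study.
sizes_cfp = [10, 20, 30, 40]      # functional size of each task in CFP
actual_hours = [28, 47, 78, 97]   # actual effort of each task in hours

intercept, slope = fit_line(sizes_cfp, actual_hours)
model_estimates = [intercept + slope * s for s in sizes_cfp]
print(f"effort = {intercept:.2f} + {slope:.2f} * CFP")
print(f"MMRE = {mmre(actual_hours, model_estimates):.1%}")
```

The same `mmre` function can be applied to any set of estimates, whatever their origin, as long as they are expressed in the same unit as the actual effort.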

    And of course, the closing statement in the Position Paper is subject to the caveat repeated 5 times in the preceding paragraphs of this Conclusion section:

    ‘In summary, although the Planning Poker / Story Points are widely recognized and used in the Agile community, the COSMIC measurement method provides objective evidence of the team performance as well as better estimates.’

    The key issue is not whether a company follows (or does not follow) ‘all’ of the best practices, nor whether that has an impact on whether the estimation process leads to good, better or worse estimates.

    Most of Mark’s key concerns on ‘invalidity’ are not at that level, but concern what appears to be our misinterpretation of the term ‘Story Points’ and the impact of this apparent misinterpretation (i.e. using the concept of ‘estimated days’ for the other concept labeled ‘Story Points’, a unit-less number without a meaning).

  • #5748

    Alain Abran
    Participant

    I have some additional thinking, referring to ‘Best practices’:

    A- ‘Story Points and Planning Poker’ are indeed ‘popular’ and ‘well-known’, but it would be quite a stretch to classify them as ‘best practices’!

    B- The numbers within the Fibonacci sequence are indeed unit-less (or better, as Charles pointed out, ‘dimension-less’). However, their usage within a context must be combined with something else to be useful. In ‘Story Points’ the Fibonacci numbers are combined with ‘something’ based on the intuition (of a person or a group), and in yesterday’s discussion, it was initially with days in one sub-step, next with a ‘calibration of days’, and in the next sub-step, the team members are told to forget about the unit used for the calibration. Note that ‘forgetting’ about something does not make that ‘something’ disappear: it still exists and it was still done as an activity.

    Therefore, while the Fibonacci sequence is unit-less, Story Points are neither ‘unit-less’ nor dimension-less:
    in the company where the data was collected, the Story Points were dimensioned initially in ‘days’, the team was told to forget about it, and from there on they used the label Story Points instead.

    Similarly, Mark has explained how the dimensioning is done in multiple sub-steps (his two-step calibration process instead of a single direct step) + the step of ‘forgetting how it was done’ and renaming. In summary, the labelling in Story Points and the assertion that it is ‘unit-less’ are definitely far from ‘best practices’. Others would not hesitate to call this ‘invalid’.

    We should therefore be very careful with claims of validity when asserting that Story Points are unit-less, and with statements that specific topics in the Position Paper are invalid: a number of items in the Position Paper might not correspond to what is widely believed within a community, but this is not a sufficient basis to consider them invalid.

  • #5750

    Mark Thias
    Participant

    1. VERY IMPORTANT: Story Points are unit-less. The technique I described to you, which I use to get the team to unit-less, relative sizing (parts a and b, as you describe), is irrelevant and in no way should be part of the discussion. If my technique is part of your discussion it will really confuse matters for us bridging the agile community. Other coaches may use a different way to get teams to relative sizing, perhaps never bringing up the idea of an initial duration target. So please, if you would, I need you to do the following; perhaps it’s a “trust me” scenario. If you do not agree, let us please have another phone conversation.

    • Do not apply units to Story Points either initially or otherwise. Assume the team can jump right into relative sizing using the Fibonacci sequence through the planning poker exercise. For the purposes of discussion and the paper, my technique for getting teams to relative sizing is irrelevant.
    • There is no such thing or concept as an initial calibration and then a recalibration. I’m hoping I can talk you out of using these terms, processes, concepts…they do not exist in agile.

    2. I agree with your feedback about not using the words ‘best practice’ in the response.

  • #5756

    Charles Symons
    Participant

    I think we can agree on the following:

    • The paper by Alain and others describes their findings in a particular company that does not follow Agile best practice as recommended by, amongst others, Mike Cohn. Mike was one of the founders of the Agile movement whose view on best practice is described in his book ‘Agile Estimating and Planning’.
    • Alain’s findings that COSMIC sizes correlate much better with effort on sprints than Story Points correlate with effort, as practised in this company, therefore do not mean that these findings are generally valid for all organisations that follow Agile best practice.
    • Having said that, we all acknowledge that Story Points (however interpreted, as ‘unit-less’, or a measure of size of a Story, or as an indication of effort for a sprint) are only valid in the context of a particular project team. There is no expectation that Story Points will correlate any better with sprint effort than as described in Alain’s paper, even if project teams do follow Agile best practice.
    • Story Points do not provide any basis for activities such as comparing performance across project teams, contracting with software suppliers, or for estimating the total effort for new projects. Mike is therefore exploring the use of COSMIC in Agile projects, as an objective measure of the size of user stories that can also be used for these other purposes.
  • #5758

    Mark Thias
    Participant

    Hi Charles,

    Thank you as always for your collaboration.

    Indeed, from the perspective of absolutes SP is a bit crazy, no argument there. As a reasonably fast way to estimate, it works pretty well, saving untold hours in speculative estimation. With Story Points being unit-less, it can never be construed as anything but an estimate aka a promise. As we all know, estimates in hours, man-days etc. are rapidly converted to a promise by leadership. This is often the reason developers struggle so mightily with estimates, asking for more and more definition (locked-down requirements).

    As far as sizing, you know how I feel about this by now. I’m hoping for something better, like COSMIC. I want to compare delivery for teams and vendors. I look forward to reading the deck. THANK YOU!

    I continue to recommend the word ‘unitless’ because it is the accepted term used to describe SP in countless books and literature. We would lose the agile community pretty fast.

    -Mark
