Explanation:
        If dependence between
          two or more items is suspected, it can be investigated by combining these
          items into a single, more extended item, called a subtest in RUMM
          (sometimes referred to as a testlet in other discussions).
        If two or more dependent
          items are summed, then the subtest produced will be a single polytomous item
          whose maximum score is the sum of the maximum scores of the individual items
          involved. In this case, the interpretation of the threshold estimates
          differs from that of the thresholds of a typical polytomous item composed of
          ordered categories.
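        As a minimal sketch of how a subtest is formed, the Python snippet below
          sums the scores of a set of items for each person. The function name
          make_subtest, the item names and the response data are hypothetical
          and are not part of RUMM.

```python
def make_subtest(responses, item_names):
    """Sum the scores of the named items for each person."""
    return [sum(person[name] for name in item_names) for person in responses]

# Hypothetical responses of four persons to three dichotomous items (scored 0 or 1).
responses = [
    {"item1": 0, "item2": 0, "item3": 1},
    {"item1": 1, "item2": 1, "item3": 0},
    {"item1": 1, "item2": 1, "item3": 1},
    {"item1": 0, "item2": 1, "item3": 1},
]
max_scores = {"item1": 1, "item2": 1, "item3": 1}

# Combine the two items suspected of dependence into one polytomous subtest.
dependent_items = ["item1", "item2"]
subtest_scores = make_subtest(responses, dependent_items)

# The maximum score of the subtest is the sum of the item maximum scores.
subtest_max = sum(max_scores[name] for name in dependent_items)
```

        Here the subtest is scored 0, 1 or 2, and item3 would remain in the
          analysis as a separate item.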
        The structure of the
          latter, where the categories are intended to be ordered, demands that
          the thresholds, which define boundaries between the categories, are also
          ordered. On the other hand, when subtests are formed from a set of dependent
          items, there is no reason for the thresholds to be ordered. Indeed, the more
          local dependence accounted for by the subtest, the more the thresholds will be
          disordered.
        This effect follows
          because the more dependent the items within a subtest, the more the scores of
          the subtest tend toward the extremes, that is, toward 0 and the maximum score
          on the subtest, for any person location. Therefore, given a person location,
          the probability of a response in the middle categories is lower than it would
          be under independence. To produce these lower probabilities in the middle
          categories, the threshold estimates are closer together than under
          independence, and may even be reversed.
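        The effect can be illustrated numerically for a subtest of two
          dichotomous items. The sketch below is not RUMM's estimation procedure.
          It assumes, purely for illustration, that success on the first item lowers
          the difficulty of the second by dep and that failure raises it by dep; the
          function and parameter names are likewise hypothetical. The threshold gap
          is taken as log(P1^2 / (P0 * P2)), which equals the distance between the
          two thresholds when the subtest scores follow the polytomous Rasch model,
          and is evaluated here at a single person location.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def subtest_probs(beta, d1, d2, dep=0.0):
    """Probabilities of scores 0, 1, 2 on a two-item subtest.

    Assumption for illustration: item 2 has difficulty d2 - dep after a
    success on item 1 and d2 + dep after a failure. dep = 0 is independence.
    """
    p_item1 = logistic(beta - d1)
    p_item2_after_success = logistic(beta - (d2 - dep))
    p_item2_after_failure = logistic(beta - (d2 + dep))
    p_score0 = (1 - p_item1) * (1 - p_item2_after_failure)
    p_score2 = p_item1 * p_item2_after_success
    p_score1 = 1 - p_score0 - p_score2
    return p_score0, p_score1, p_score2

def threshold_gap(probs):
    """Second threshold minus first threshold implied by the score probabilities."""
    p0, p1, p2 = probs
    return math.log(p1 * p1 / (p0 * p2))

# Same person location and item difficulties, without and with dependence.
independent = subtest_probs(beta=0.0, d1=0.0, d2=0.0, dep=0.0)
dependent = subtest_probs(beta=0.0, d1=0.0, d2=0.0, dep=2.0)

print("middle category probability:", independent[1], "vs", dependent[1])
print("threshold gap:", threshold_gap(independent), "vs", threshold_gap(dependent))
```

        Under these assumptions the middle-category probability falls when dep
          is increased, and the implied threshold gap changes from positive
          (ordered) to negative (reversed).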
        At the same time, it is
          important to note that the difference in difficulties
          of the items of the subtest will trade off with their local dependence. As the
          variance of the difficulties of the component items within a subtest
          increases, the thresholds of the subtest move further apart. When there is
          little local dependence and the items differ reasonably in difficulty,
          the effects trade off and the thresholds will be ordered.
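        For two locally independent dichotomous Rasch items, the thresholds of
          their summed score can be written down directly, which makes the role of
          the difficulty difference concrete. The sketch below assumes the usual
          parameterisation in which the probability of score x is proportional to
          exp(x * beta - (sum of the first x thresholds)); the function name is
          hypothetical.

```python
import math

def independent_subtest_thresholds(d1, d2):
    """Thresholds of the sum of two independent Rasch items with difficulties d1, d2.

    Matching score probabilities gives exp(-tau1) = exp(-d1) + exp(-d2)
    and tau1 + tau2 = d1 + d2.
    """
    tau1 = -math.log(math.exp(-d1) + math.exp(-d2))
    tau2 = d1 + d2 - tau1
    return tau1, tau2

# Widen the difference in difficulty while keeping the mean difficulty at 0.
for spread in (0.0, 1.0, 2.0, 4.0):
    tau1, tau2 = independent_subtest_thresholds(-spread / 2, spread / 2)
    print(f"difficulty difference {spread:.1f}: "
          f"thresholds {tau1:.3f}, {tau2:.3f}, gap {tau2 - tau1:.3f}")
```

        Under independence the gap equals 2 * log(2 * cosh((d1 - d2) / 2)), so it
          is never smaller than 2 * log(2) and grows as the items differ more in
          difficulty. Local dependence has to overcome this gap before the
          thresholds reverse.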
        If there is local
          dependence among the items placed in subtests, then the subtest analysis will
          generally show better fit than the original analysis. This is in part because
          response dependence has been taken into account and absorbed into the
          thresholds, and in part because the reliability (person separation) with the
          subtests will be reduced, resulting in a loss of relative power in the test
          of fit.
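        The drop in reliability can be illustrated with a small, artificial
          example. RUMM reports a person separation index; the sketch below instead
          uses coefficient alpha as a simpler stand-in, which is a substitution made
          here for illustration only. The data are hypothetical and extreme: the two
          items within each pair are perfectly dependent (identical responses).

```python
from statistics import pvariance

def coefficient_alpha(columns):
    """Coefficient alpha for a list of score columns (one list per item)."""
    k = len(columns)
    totals = [sum(row) for row in zip(*columns)]
    sum_item_variances = sum(pvariance(col) for col in columns)
    return k / (k - 1) * (1 - sum_item_variances / pvariance(totals))

# Hypothetical responses of six persons; item_b duplicates item_a and
# item_d duplicates item_c, i.e. perfect dependence within each pair.
item_a = [0, 0, 1, 1, 0, 1]
item_b = list(item_a)
item_c = [0, 0, 1, 1, 1, 1]
item_d = list(item_c)

# Original analysis: four separate items.
alpha_items = coefficient_alpha([item_a, item_b, item_c, item_d])

# Subtest analysis: each dependent pair summed into one item scored 0 to 2.
subtest_1 = [a + b for a, b in zip(item_a, item_b)]
subtest_2 = [c + d for c, d in zip(item_c, item_d)]
alpha_subtests = coefficient_alpha([subtest_1, subtest_2])

print("alpha with separate items:", round(alpha_items, 3))
print("alpha with subtests:", round(alpha_subtests, 3))
```

        With separate items the duplicated responses inflate the apparent
          reliability; once each pair is treated as one subtest the coefficient is
          lower, in line with the reduced person separation described above.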
        References:
        Andrich, D.
          (1985). A latent trait model for items with response dependencies:
          Implications for test construction and analysis. In S. Embretson (Ed.), Test
          design: Contributions from psychology, education and psychometrics.
          New York: Academic Press. Chapter 9, pp. 245-273.
        Andrich, D.
          (2005). Rasch models for ordered response categories. In B. Everitt
          & D. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science.
          New York: John Wiley & Sons. Volume 4,
          pp. 1698-1707.