I. Foreword: origin and use of this document

This document is created to capture use-cases that define what a "quantity" is for use in the Virtual Observatory (VO) data model (DM). This is needed because, even though discussions on the data modeling list and at various ADASS/IVOA conferences show a general overlap in sentiment, that understanding is too vague a basis on which to proceed to actual modeling and code creation. This document is an attempt to formally define what the community wants to consider as a "quantity" within the VO DM framework.

I have started this document based on Jonathan McDowell's summary of the quantity discussion, adding supplementary comments as I was able to discern them from the mailing list or as they were sent privately to me. Any issues that were mentioned about implementation philosophy have been dropped, as that should *follow* the "whatis" discussion and the creation of requirements for quantity.

The document is created as a general community-editable resource, so if you see that your point of view is missing or misrepresented, please edit your point into this document. You are asked only to obey a few "rules":

1. Don't remove someone else's comment/use-case unless they concur with your edit.
2. When adding a comment/use-case, please leave your name in brackets "[]" so that others know to whom to refer if they have questions about it.
3. Give your new use-case a number id so that it may be referred to succinctly in other discussions.

For those of you who are wondering about a definition of "use-case", I refer you to the following paraphrase taken from Eriksson, 1998 ["UML Toolkit", Wiley Computer Publishing, pg 45]:

"A use-case is used to describe what a system should do. A use-case model is built through an iterative process during which discussions between system developers lead to a requirement specification on which all agree."
Hopefully, as in the parable about blind men feeling the various parts of an elephant, we can proceed through this exercise to fully discover and describe the "true" quantity.

Regards,
-brian

II. Use cases for "quantity"

1. A Quantity describes numerical values. [Thomas]
   [Thomas]: yes.
   [Plante]: (Oct 31) yes.

2. A quantity has a "name" attribute which allows description of the (phenomenon, variable, etc). [McDowell]
   (Was: Is the name of the (phenomenon, variable, etc) part of Q?)
   [Dowler]: No (Dowler::Property name in a separate place)
   [Berry]: Yes (Berry::DataContainer: yes, include name and/or label)
   [McDowell]: Yes. I would like to see a name as part of Q; then scalar Qs can serialize as a FITS keyword.
   [Thomas]: No. Keep it simple, not everything needs "name". If you need a quantity-based FITS keyword, then create the class "FitsQuantity" that adds the "name" attribute. Quantity should be an abstract class that never appears generically, e.g. it always makes its appearance as some concrete class that inherits from it.
   [Plante]: (Oct 31) It's probably a good idea; however, in practice, I expect that we won't be handling generic Quantity objects, but specialized versions (e.g. Frequency) in which the name is locked in.

3. A quantity may be a multi-dimensional entity (e.g. Q(i,j) with i, j being indices). [McDowell]
   (Was: Does Q support arrays? Multi-dim arrays?)
   [Dowler]: yes
   [Thomas]: Yes. We need to treat things like matrices/tensors. Every 'cell' in the matrix has the same units, data type.
   [Didelon]: no (model something simple)
   Reason: same unit for many samples, commonalities
   [McDowell]: I say yes, because we will need such a multi-dim object, and it will be exactly the same as Q except for the multi-dimensionality. So why do the same work twice? So much of astronomical life is N-dim array based. We might as well put it in at the basic level.
   [Plante]: In short, I say yes; however, I think arrays (of any D) only make sense if the values are fully homogeneous, i.e. same unit, same error model. Otherwise Q becomes much more complicated. It would be easier to create an aggregation of Qs to handle the individual tagging of components.

4. Quantities support multi-dim arrays with links to other Q's as axes, example: Flux(Wavelength). [McDowell]
   [Thomas]: (Oct 29) yes
   [Plante]: no, higher level object to connect data Q with axis Qs
   [McDowell]: I agree with Ray, this should be a higher level object. Q should be the values associated with a single UCD (not counting modifiers like error, quality), and anything connecting two UCDs should not be in Q. (Admittedly our CfA DataContainer object does do this WCS stuff, but I think we are trying to keep Q a little simpler.)

5. An array quantity and a scalar quantity should be separate classes (e.g. Dowler::AtomicQuantity, Dowler::ArrayQuantity). [McDowell]
   [McDowell]: No. Array is not a separate class; scalar is simply the case n = 1. Failing that, at least a class inheriting via restriction from Array, not a separate derivation from Q.
   [Dowler]: (Oct 29) "really dislike the array of length 1..." (but I think this is just for the serialization, not the internal class representation, so perhaps reconciliation is possible)
   [Thomas]: No. Agree with McDowell.

6. Quantity includes heterogeneous arrays (with different UCD, units, type in cells of multi-dimensional object).
   [Thomas]: Ok, "no", but I would like to see another "base" level class that obeys the "quantity" interface and can group quantities (e.g. like "thomas::QuantitySet").
   [Dowler]: No, but consider representation issues (ISO date, numerical error)
   [McDowell]: Probably no, at least for rev 1

7. Quantity needs to associate units with its value(s). [McDowell]
   Everyone (I think!): Yes
   [Thomas]: Yes, but units themselves are an interface/abstract class described in a separate base package.

8. Quantity needs to have errors as part of its meta-data (rather than in a separate class like "measurement" that aggregates a "value" and "error" quantities).
   [Dowler]: No. Dowler::Measurement: not in Q; lots of things in VO are not physical measurements and do not have errors.
   [Thomas]: Yes. Data fusion requires errors in Q; errors *always* have (within a numerical factor) the same units as the values they describe. I suggest that Dowler::Measurement maps to Q (and Dowler::Quantity does not).
   [Berry]: Yes. Berry::DataContainer should include them
   [McDowell]: Yes, but don't model the Error object fully yet.

9. If errors are in Q, should there be a simpler class similar to Dowler::Quantity which does not contain errors? [McDowell]
   [Didelon]: yes (Oct 30)
   [Thomas]: No (qualified). If someone can come up with a use-case for it, then I'm OK with it.
   [McDowell]: I think no, there's no need for a 'Simple-Q' with no errors (as opposed to a Q with a null error); it doesn't add significant weight to the class (and in the XML serialization doesn't have to add any weight?)
   [Lemson]: (Oct 30) Dowler::Quantity is individual pixels, but errors may be correlated. Whole image is a Lemson::Result and not a Dowler::Quantity.
   [McDowell]: I like the idea that a single pixel can be a Quantity on its own, and an array of pixels can be a Quantity. Much fun will be hidden in the Error model. In particular, even in an image where the errors are correlated, one sometimes asks 'what is the absolute error on this pixel?', or 'what is the relative error on this pixel?', information that really is meaningful for just that pixel alone. Sometimes, in contrast, one asks 'what is the error on the flux extracted from this group of pixels?', in which case the array's Error is the thing you need to use. The fact that errors are correlated doesn't mean it's meaningless to ask a pixel what its error is, and so doesn't mean Quantity shouldn't have an Error object.

10. There is an intermediate astronomy/container-type object between Observation-level classes and Quantity.
   [Tody]: Adding quality etc. to Q makes it no longer Q, but Tody::Dataset
   [McDowell]: I introduce this question because of Doug's comment; one can perhaps recast the continuum of opinions into a divide between those who want Q really simple (scalar value + unit, no array, no name, no error) and those who (like me) want Q to be the basis for containing everything except the astronomy (array values, unit, quality, errors, perhaps even coords). Maybe that's an indication there are two objects to be modelled, even if some of us think that using the extra, simpler object will mean more difficulty in writing properly general application code.
   [Thomas]: Yes. There may even be more than one intermediate "component" package level.

11. Quantity should contain a quality class.
   [Berry]: Berry::DataContainer, yes (specific flags, not overall Obs quality)
   [Thomas]: Yes and no. "Quality" is a type of accuracy, so a 'hook'/pointer for it appears in quantity. But actual concrete classes of quality are defined in an outside package(s), most of which are outside the base level.
   [Tody]: No, keep Q simple
   [Micol]: No, keep Q simple
   [McDowell]: yes, I think it would be good to have this

12. Quantity supports string values (in addition to numerical ones).
   [Thomas]: Yes. This will bite us later in data-fusion/search issues if we don't make it a basic part.
   [Plante]: (Oct 29) no. (Oct 31) I agree with Gerard that string values should be handled as a "Classification". To be convinced otherwise, I need an example of a string type that semantically supports the concept of an Error.
   [McDowell]: strongly yes

13. Quantity can also be used for meta-data parts of other classes.
   [Thomas]: Strongly yes. It is entirely contextual (of the searcher) as to whether something is "data" or "meta-data" within any given class. Don't see how we could avoid this.
   [Didelon]: says Dowler's separation (in fact, layering) of concept and Q is good. Thomas seems to think he is arguing for separation of data and meta-data. I don't see the connection, but perhaps I missed something.
   [Plante]: (Oct 31) In my mind, this is the primary use for the data model. As I've said before, "data" may use some other storage mechanism.

14. Quantity should describe its data type.
   [McDowell]: yes.
   [Thomas]: yes.
   [Dowler]: yes.

15. Quantities support "complex" types. [McDowell]
   [Dowler]: yes. Dowler::Type = Ellipse2D (Oct 27); polygon types (Oct 29)
   [McDowell]: Suggest we not rule this out, but an initial implementation would only support basic datatypes.
   [Thomas]: Yes, but agree that as a first cut we consider the scalar "primitives", e.g. "float", "integer", "string", then move up to "vectors", THEN full-object types like "Ellipse2D".
   [Plante]: (Oct 31) Yes(?) Agree with McDowell statement(?).

16. Quantity should include coverage, completeness.
   Everyone: no, belongs in Observation
   [Thomas]: no.

17. Quantity can describe everything in a FITS file or VOTable.
   Everyone (?): No! Maybe this is true for Observation
   [Thomas]: Yes, I think, but this case appears to be ill-defined. I would phrase it as "The quantity should be capable of serving as the base class for all the searchable, exchangeable, fusionable parts of VOTable/FITS structures."

18. Quantity has transform/mappings functionality.
   [Didelon]: no
   [Thomas]: No, but it should have "hooks"/pointers to some mapping interface. The actual concrete mapping classes should be in a separate package that lies at either the base or component package level.
   [Berry]: yes (since we need to know why pixel 3 and pixel 4 are distinct)
   [Plante]: no
   [McDowell]: no, this should be the next higher level object

19. Quantity has methods to describe quantity arithmetic?
   [Barnes]: yes
   [Thomas]: Yes, but as for mappings. In fact, it may be a type of mapping.
   Everyone else: maybe but not yet?
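
As an aid to discussion only, the sketch below shows how the positions that drew mostly "yes" answers might fit together in one class: numerical values (use-case 1), homogeneous multi-dimensional arrays (3), scalar as the n = 1 case (5), units (7), optional errors (8, still contested) and a data type (14). Every name in it (Quantity, values, unit, shape, error, datatype, is_scalar) is hypothetical, the unit is a plain string standing in for a separate units package, and nothing here is an agreed VO DM design.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Number = Union[int, float]


@dataclass(frozen=True)
class Quantity:
    """Hypothetical sketch for discussion; not an agreed VO DM class."""

    values: Tuple[Number, ...]                  # use-case 1: flat storage of numerical cells
    unit: str                                   # use-case 7: plain string stands in for a Unit class
    shape: Tuple[int, ...] = (1,)               # use-case 3; a scalar is the case n = 1 (use-case 5)
    error: Optional[Tuple[Number, ...]] = None  # use-case 8: None means "no error given"

    def __post_init__(self):
        if not self.values:
            raise ValueError("a quantity needs at least one value")
        size = 1
        for n in self.shape:
            size *= n
        if len(self.values) != size:
            raise ValueError("shape does not match the number of values")
        # Homogeneity: every cell shares one data type (and, by construction, one unit).
        if len({type(v) for v in self.values}) > 1:
            raise ValueError("all cells must share one data type")
        if self.error is not None and len(self.error) != len(self.values):
            raise ValueError("error must have one entry per value")

    @property
    def datatype(self) -> str:
        # use-case 14: the quantity can describe its own data type
        return type(self.values[0]).__name__

    @property
    def is_scalar(self) -> bool:
        return len(self.values) == 1


# A 2x2 array quantity with per-cell errors, and a scalar quantity without errors.
flux = Quantity(values=(1.5, 2.0, 2.5, 3.0), unit="Jy", shape=(2, 2),
                error=(0.1, 0.1, 0.2, 0.2))
redshift = Quantity(values=(0.03,), unit="")
```

Contested features such as "name" (use-case 2), quality (11) and mappings (18) are deliberately left out of the sketch; under the positions recorded above they could be added by subclasses or by a higher-level container.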