Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-6055

Customize handling of DateTimeTypeName based on type system

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • None

    Description

      Background:

      In ISO SQL, a TIMESTAMP comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery DATETIME. BigQuery chose to give its native TIMESTAMP type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.”

      ISO SQL has 2 other TIMESTAMP types. The TIMESTAMP WITH TIME ZONE simply attaches a time zone to the existing TIMESTAMP type, making it both a set of clock-calendar parameters and an unambiguous instant in time. The final type – TIMESTAMP WITH LOCAL TIME ZONE – is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery TIMESTAMP.

      This creates the confusing situation where a BigQuery DATETIME is actually an ISO TIMESTAMP, and a BigQuery TIMESTAMP is actually an ISO TIMESTAMP WITH LOCAL TIME ZONE.

      Problem:

      This is very similar to CALCITE-5424, which only applied to literals (e.g. TIMESTAMP '2023-10-17 12:00:00'). We need a similar solution for the DateTimeTypeName construction which is used in other contexts (e.g. CAST('2023-10-17 12:00:00' AS TIMESTAMP)). Seems like we can use a very similar approach: introduce a DATETIME type name, and use SqlUserDefinedTypeNameSpec for both that and the TIMESTAMP type names. During validation and sql-to-rel conversion, we can look up the type names in the UDT map. If no mapping is found for DATETIME (which would occur for any dialect besides BigQuery), an error would occur during validation, because that type does not exist in that dialect. If no mapping is found for TIMESTAMP, we would fall back on the canonical TIMESTAMP type, preserving the existing behavior for all other dialects.

      In summary:

      CAST('2023-10-17 12:00:00' AS TIMESTAMP) should produce a TIMESTAMP WITH LOCAL TIME ZONE for BigQuery, but a regular TIMESTAMP for all other dialects.
      CAST('2023-10-17 12:00:00' AS DATETIME) should produce a TIMESTAMP for BigQuery, but throw a validation error for all other dialects.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              wnoble Will Noble
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: