ICU-2966 Remove these out of date files. Look in icu/source/tools/gentz/tzcode for new code.

X-SVN-Rev: 13984
This commit is contained in:
George Rhoten 2003-12-04 00:04:56 +00:00
parent 0fbe7eb84e
commit a944df48d8
8 changed files with 2 additions and 2380 deletions


@ -1,114 +1,4 @@
Copyright (C) 1999-2001, International Business Machines Corporation
Copyright (C) 1999-2003, International Business Machines Corporation
and others. All Rights Reserved.
Readme file for ICU time zone data (source/tools/gentz)
Alan Liu
Last updated 2 Feb 2001
RAW DATA
--------
The time zone data in ICU is taken from the UNIX data files at
ftp://elsie.nci.nih.gov/pub/tzdata<year>. The other input to the
process is an alias table, described below.
BUILD PROCESS
-------------
Two tools are used to process the data into a format suitable for ICU:
tz.pl directory of raw data files + tz.alias -> tz.txt
gentz tz.txt -> tz.dat (memory mappable binary file)
After gentz is run, standard ICU data tools are used to incorporate
tz.dat into the icudata module. The tz.pl script is run manually;
everything else is automatic.
In order to incorporate the raw data from that source into ICU, take
the following steps.
1. Download the archive of current zone data. This should be a file
named something like tzdata1999j.tar.gz. Use the URL listed above.
2. Unpack the archive into a directory, retaining the name of the
archive. For example, unpack tzdata1999j.tar.gz into tzdata1999j/.
Place this directory anywhere; one option is to place it within
source/tools/gentz.
3. Run the perl script tz.pl, passing it the directory location as a
command-line argument. On Windows systems use the batch file
tz.bat. Also specify one or more output files: .txt, .htm|.html,
and .java.
For ICU4C specify .txt; typically
<icu>/source/data/misc/timezone.txt
where icu is the ICU4C root directory. Double check that this is
the correct location and file name; they change periodically.
It is useful to generate an html file. After it is generated,
review it for correctness.
As the third argument, pass in "tz.java". This will generate a
java source file that will be used to update the ICU4J data.
4. Do a standard build. The build scripts will automatically detect
that a new .txt file is present and rebuild the binary data (using
gentz) from that.
The .txt and .htm files are typically checked into CVS, whereas
the raw data files are not, since they are readily available from the
URL listed above.
Additional steps are required to update the ICU4J data. First you
must have a current, working installation of icu4j. These instructions
will assume it is in directory "/icu4j".
5. Copy the tz.java file generated in step 3 to /icu4j/tz.java.
6. Change to the /icu4j directory and compile the tz.java file, with
/icu4j/classes on the classpath.
7. Run the resulting java program (again with /icu4j/classes on the
classpath) and capture the output in a file named tz.tmp.
8. Open /icu4j/src/com/ibm/util/TimeZoneData.java. Delete the section
that starts with the line "BEGIN GENERATED SOURCE CODE" and ends
with the line "END GENERATED SOURCE CODE". Replace it with the
contents of tz.tmp. If there are extraneous control-M characters
or other similar problems, fix them.
9. Rebuild icu4j and make sure there are no build errors. Rerun all
the tests in /icu4j/src/com/ibm/test/timezone and make sure they
all pass. If all is well, check the new TimeZoneData.java into
CVS.
ALIAS TABLE
-----------
For backward compatibility, we define several three-letter IDs that
have been used since early ICU and correspond to IDs used in old JDKs.
These IDs are listed in tz.alias. The tz.pl script processes this
alias table and issues errors if there are problems.
IDS
---
All *system* zone IDs must consist only of characters in the invariant
set. See utypes.h for an explanation of what this means. If an ID is
encountered that contains a non-invariant character, tz.pl complains.
Non-system zones may use non-invariant characters.
Etc/GMT...
----------
Users may be confused by the fact that various zones with names of the
form Etc/GMT+n appear to have an offset of the wrong sign. For
example, Etc/GMT+8 is 8 hours *behind* GMT; that is, it corresponds to
what one typically sees displayed as "GMT-8:00". The reason for this
inversion is explained in the UNIX zone data file "etcetera".
Briefly, this is done intentionally in order to comply with
POSIX-style signedness. In ICU we reproduce the UNIX zone behavior
faithfully, including this confusing aspect.
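As a concrete illustration of the inversion, here is a short sketch (Python, purely illustrative and not part of the build tools; the "GMT-8:00"-style display format is an assumption about conventional rendering):

```python
# Illustration of the POSIX-style sign inversion for Etc/GMT zones.
# The integer in an "Etc/GMT+n" ID is a POSIX-style offset: the number
# of hours ADDED to local time to reach GMT, so the conventional
# display offset has the opposite sign.

def display_offset(zone_id: str) -> str:
    """Return a conventional 'GMT-8:00'-style string for an Etc/GMT ID."""
    suffix = zone_id.removeprefix("Etc/GMT")
    if not suffix:                      # plain "Etc/GMT"
        return "GMT+0:00"
    hours = int(suffix)                 # e.g. "+8" -> 8
    sign = "-" if hours > 0 else "+"    # inverted sign for display
    return f"GMT{sign}{abs(hours)}:00"

print(display_offset("Etc/GMT+8"))    # -> GMT-8:00
print(display_offset("Etc/GMT-10"))   # -> GMT+10:00
```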
The gentz tool went away. Please look at icu/source/tools/gentz/tzcode/readme.txt


@ -1,45 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# A simple alias list. We use this to retain backward compatibility.
# For example, ICU has always defined the zone name "PST" to indicate
# the zone America/Los_Angeles. Unless we continue to have a zone with
# this ID, legacy code may break.
#
# This list is read by tz.pl and the alias names listed here are
# incorporated into the tz.txt file as clones.
#
# Format: alias_name unix_name # optional comment
ACT Australia/Darwin
AET Australia/Sydney
AGT America/Buenos_Aires
ART Africa/Cairo
AST America/Anchorage
BET America/Sao_Paulo
BST Asia/Dhaka # spelling changed in 2000h; was Asia/Dacca
CAT Africa/Harare
CNT America/St_Johns
CST America/Chicago
CTT Asia/Shanghai
EAT Africa/Addis_Ababa
ECT Europe/Paris
# EET Europe/Istanbul # EET is a standard UNIX zone
EST America/New_York # Linked to America/Indianapolis in Olson
HST Pacific/Honolulu # Olson LINK
IET America/Indianapolis
IST Asia/Calcutta
JST Asia/Tokyo
# MET Asia/Tehran # MET is a standard UNIX zone
MIT Pacific/Apia
MST America/Denver # Linked to America/Phoenix in Olson
NET Asia/Yerevan
NST Pacific/Auckland
PLT Asia/Karachi
PNT America/Phoenix
PRT America/Puerto_Rico
PST America/Los_Angeles
SST Pacific/Guadalcanal
UTC Etc/UTC # Olson LINK
VST Asia/Saigon
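The alias-table line grammar above (name, UNIX name, optional `#` comment) is simple; this hedged sketch shows how such lines can be parsed (Python for illustration, not the actual tz.pl parser):

```python
# Minimal parser for the alias-table line format:
#   alias_name unix_name  # optional comment
# Illustrative sketch only; tz.pl's real parser is Perl.

def parse_alias_line(line: str):
    """Return (alias, unix_name), or None for blank/comment-only lines."""
    line = line.split("#", 1)[0].strip()   # strip trailing comment
    if not line:
        return None
    alias, unix_name = line.split()
    return alias, unix_name

# Example: parse_alias_line("PST America/Los_Angeles")
#          -> ("PST", "America/Los_Angeles")
```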


@ -1,17 +0,0 @@
@echo off
REM Copyright (C) 1999, International Business Machines
REM Corporation and others. All Rights Reserved.
REM This script is a Windows launcher for the tz.pl script. For this
REM to work, the perl executable must be on the path. We recommend
REM the ActiveState build; see http://www.activestate.com. See the
REM tz.pl script itself for more documentation.
if "%OS%" == "Windows_NT" goto WinNT
perl -w -x -S "tz.pl" %1 %2 %3 %4 %5 %6 %7 %8 %9
goto end
:WinNT
perl -w -x -S "tz.pl" %*
if NOT "%COMSPEC%" == "%SystemRoot%\system32\cmd.exe" goto end
if %errorlevel% == 9009 echo You do not have Perl in your PATH.
:end


@ -1,42 +0,0 @@
######################################################################
# Copyright (C) 1999-2003, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# Default zone list. If ICU cannot find an exact match for the host
# time zone, it picks a zone that matches the host offset. There may
# be many such zones, however, so it must know which one to select.
#
# This list is read by tz.pl and the default names listed here are
# incorporated into the tz.txt file as preferred default zones.
# Any conflicts (multiple defaults for the same offset) or absences
# (no defaults specified for an offset) are reported.
#
# Format: default_name # optional comment
Africa/Addis_Ababa
Africa/Cairo
America/Anchorage
America/Buenos_Aires
America/Chicago
America/Denver
America/New_York
America/Los_Angeles
America/Puerto_Rico
America/St_Johns
Asia/Calcutta
Asia/Dhaka # spelling changed in 2000h; was Asia/Dacca
Asia/Karachi
Asia/Riyadh89 # Pick the chronologically latest of this group
Asia/Saigon
Asia/Shanghai
Asia/Tokyo
Asia/Yerevan
Atlantic/Azores # Windows lists Azores, Cape Verde
Australia/Darwin
Australia/Sydney
Europe/Paris
GMT
Pacific/Apia
Pacific/Auckland
Pacific/Guadalcanal
Pacific/Honolulu
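The conflict/absence check described in the header comment (each raw offset should have exactly one preferred default) can be sketched generically; the zone names and offsets below are hypothetical illustration, not real data, and this is not the tz.pl implementation:

```python
# Sketch of the default-zone sanity check: report offsets that have
# more than one default (conflicts) or none at all (absences).
# Hypothetical data; illustrative only.

def check_defaults(zone_offsets, defaults):
    """Return (conflicts, absences) for a {zone: offset} map and a default list."""
    by_offset = {}
    for name in defaults:
        by_offset.setdefault(zone_offsets[name], []).append(name)
    conflicts = {off: names for off, names in by_offset.items() if len(names) > 1}
    absences = sorted(set(zone_offsets.values()) - set(by_offset))
    return conflicts, absences

zone_offsets = {"A": -28800, "B": -28800, "C": 3600}   # hypothetical
print(check_defaults(zone_offsets, ["A", "B"]))
# Two defaults share -28800 (conflict); no default covers 3600 (absence).
```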

File diff suppressed because it is too large


@ -1,328 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# See: ftp://elsie.nci.nih.gov/pub/tzdata<year>
# where <year> is "1999b" or a similar string.
######################################################################
# This package handles the parsing of time zone files.
# Author: Alan Liu
######################################################################
# Usage:
# Call ParseFile for each file to be imported. Then call ParseZoneTab
# to add country data. Then call Postprocess to remove unused rules.
package TZ;
use strict;
use Carp;
use vars qw(@ISA @EXPORT $VERSION $YEAR $STANDARD);
require 'dumpvar.pl';
@ISA = qw(Exporter);
@EXPORT = qw(ParseFile
Postprocess
ParseZoneTab
);
$VERSION = '0.2';
$STANDARD = '-'; # Name of the Standard Time rule
######################################################################
# Read the tzdata zone.tab file and add a {country} field to zones
# in the given hash.
# Param: File name (<dir>/zone.tab)
# Param: Ref to hash of zones
# Param: Ref to hash of links
sub ParseZoneTab {
my ($FILE, $ZONES, $LINKS) = @_;
my %linkEntries;
local(*FILE);
open(FILE,"<$FILE") or confess "Can't open $FILE: $!";
while (<FILE>) {
# Handle comments
s/\#.*//;
next if (!/\S/);
if (/^\s*([A-Z]{2})\s+[-+0-9]+\s+(\S+)/) {
my ($country, $zone) = ($1, $2);
if (exists $ZONES->{$zone}) {
$ZONES->{$zone}->{country} = $country;
} elsif (exists $LINKS->{$zone}) {
# We have a country mapping for a zone that isn't in
# our hash. This means it is a link entry. Save this
# then handle it below.
$linkEntries{$zone} = $country;
} else {
print STDERR "Nonexistent zone $zone in $FILE\n";
}
} else {
confess "Can't parse line \"$_\" of $FILE";
}
}
close(FILE);
# Now that we have mapped all of the zones in %$ZONES (except
# those without country affiliations), process the link entries.
# For those zones in the table that differ by country from their
# source zone, instantiate a new zone in the new country. An
# example is Europe/Vatican, which is linked to Europe/Rome. If
# we don't instantiate it, we have nothing for Vatican City.
# Another example is America/Shiprock, which links to
# America/Denver. These are identical and both in the US, so we
# don't instantiate America/Shiprock.
foreach my $zone (keys %linkEntries) {
my $country = $linkEntries{$zone};
my $linkZone = $LINKS->{$zone};
my $linkCountry = $ZONES->{$linkZone}->{country};
if ($linkCountry ne $country) {
# print "Cloning $zone ($country) from $linkZone ($linkCountry)\n";
_CloneZone($ZONES, $LINKS->{$zone}, $zone);
$ZONES->{$zone}->{country} = $country;
}
}
}
######################################################################
# Param: File name
# Param: Ref to hash of zones
# Param: Ref to hash of rules
# Param: Ref to hash of links
# Param: Current year
sub ParseFile {
my ($FILE, $ZONES, $RULES, $LINKS, $YEAR) = @_;
local(*FILE);
open(FILE,"<$FILE") or confess "Can't open $FILE: $!";
my $zone; # Current zone
my $badLineCount = 0;
while (<FILE>) {
# Handle comments and blanks
s/\#.*//;
next if (!/\S/);
#|# Zone NAME GMTOFF RULES FORMAT [UNTIL]
#|Zone America/Montreal -4:54:16 - LMT 1884
#| -5:00 Mont E%sT
#|Zone America/Thunder_Bay -5:57:00 - LMT 1895
#| -5:00 Canada E%sT 1970
#| -5:00 Mont E%sT 1973
#| -5:00 - EST 1974
#| -5:00 Canada E%sT
my ($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil);
if (/^zone/i) {
# Zone block start
if (/^zone\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/i
|| /^zone\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)()/i) {
$zone = $1;
($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil) =
($2, $3, $4, $5);
} else {
print STDERR "Can't parse in $FILE: $_";
++$badLineCount;
}
} elsif (/^\s/ && $zone) {
# Zone continuation
if (/^\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/
|| /^\s+(\S+)\s+(\S+)\s+(\S+)()/) {
($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil) =
($1, $2, $3, $4);
} else {
print STDERR "Can't parse in $FILE: $_";
++$badLineCount;
}
} elsif (/^rule/i) {
# Here is where we parse a single line of the rule table.
# Our goal is to accept only rules applying to the current
# year. This is normally a matter of accepting rules
# that match the current year. However, in some cases this
# is more complicated. For example:
#|# Tonga
#|# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S
#|Rule Tonga 1999 max - Oct Sat>=1 2:00s 1:00 S
#|Rule Tonga 2000 max - Apr Sun>=16 2:00s 0 -
# To handle this properly, we save every rule we encounter
# (thus overwriting older ones with newer ones, since rules
# are listed in order), and also use slot [2] to mark when
# we see a current year rule. When that happens, we stop
# saving rules. Thus we match the latest rule we see, or
# a matching rule if we find one. The format of slot [2]
# is just a 2 bit flag ([2]&1 means slot [0] matched,
# [2]&2 means slot [1] matched).
# Note that later, when the rules are post processed
# (see Postprocess), the slot [2] will be overwritten
# with the compressed rule string used to implement
# equality testing.
$zone = undef;
# Rule
#|# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S
#|Rule US 1918 1919 - Mar lastSun 2:00 1:00 W # War
#|Rule US 1918 1919 - Oct lastSun 2:00 0 S
#|Rule US 1942 only - Feb 9 2:00 1:00 W # War
#|Rule US 1945 only - Sep 30 2:00 0 S
#|Rule US 1967 max - Oct lastSun 2:00 0 S
#|Rule US 1967 1973 - Apr lastSun 2:00 1:00 D
#|Rule US 1974 only - Jan 6 2:00 1:00 D
#|Rule US 1975 only - Feb 23 2:00 1:00 D
#|Rule US 1976 1986 - Apr lastSun 2:00 1:00 D
#|Rule US 1987 max - Apr Sun>=1 2:00 1:00 D
if (/^rule\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+
(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/xi) {
my ($name, $from, $to, $type, $in, $on, $at, $save, $letter) =
($1, $2, $3, $4, $5, $6, $7, $8, $9);
my $i = $save ? 0:1;
if (!exists $RULES->{$name}) {
$RULES->{$name} = [];
}
my $ruleArray = $RULES->{$name};
# Check our bit mask to see if we've already matched
# a current rule. If so, do nothing. If not, then
# save this rule line as the best one so far.
if (@{$ruleArray} < 3 ||
!($ruleArray->[2] & 1 << $i)) {
my $h = $ruleArray->[$i];
$ruleArray->[$i]->{from} = $from;
$ruleArray->[$i]->{to} = $to;
$ruleArray->[$i]->{type} = $type;
$ruleArray->[$i]->{in} = $in;
$ruleArray->[$i]->{on} = $on;
$ruleArray->[$i]->{at} = $at;
$ruleArray->[$i]->{save} = $save;
$ruleArray->[$i]->{letter} = $letter;
# Does this rule match the current year? If so,
# set the bit mask so we don't overwrite this rule.
# This makes us ignore rules for subsequent years
# that are already listed in the database -- as long
# as we have an overriding rule for the current year.
if (($from == $YEAR && $to =~ /only/i) ||
($from <= $YEAR &&
(($to =~ /^\d/ && $YEAR <= $to) || $to =~ /max/i))) {
$ruleArray->[2] |= 1 << $i;
$ruleArray->[3] |= 1 << $i;
}
}
} else {
print STDERR "Can't parse in $FILE: $_";
++$badLineCount;
}
} elsif (/^link/i) {
#|# Old names, for S5 users
#|
#|# Link LINK-FROM LINK-TO
#|Link America/New_York EST5EDT
#|Link America/Chicago CST6CDT
#|Link America/Denver MST7MDT
#|Link America/Los_Angeles PST8PDT
#|Link America/Indianapolis EST
#|Link America/Phoenix MST
#|Link Pacific/Honolulu HST
#
# There are also links for country-specific zones.
# These are zones that differ only in that they belong
# to a different country. E.g.,
#|Link Europe/Rome Europe/Vatican
#|Link Europe/Rome Europe/San_Marino
if (/^link\s+(\S+)\s+(\S+)/i) {
my ($from, $to) = ($1, $2);
# Record all links in %$LINKS
$LINKS->{$to} = $from;
} else {
print STDERR "Can't parse in $FILE: $_";
++$badLineCount;
}
} else {
# Unexpected line
print STDERR "Ignoring in $FILE: $_";
++$badLineCount;
}
if ($zoneRule &&
($zoneUntil !~ /\S/ || ($zoneUntil =~ /^\d/ &&
$zoneUntil >= $YEAR))) {
$ZONES->{$zone}->{gmtoff} = $zoneGmtoff;
$ZONES->{$zone}->{rule} = $zoneRule;
$ZONES->{$zone}->{format} = $zoneFormat;
$ZONES->{$zone}->{until} = $zoneUntil;
}
}
close(FILE);
}
######################################################################
# Param: Ref to hash of zones
# Param: Ref to hash of rules
sub Postprocess {
my ($ZONES, $RULES) = @_;
my %ruleInUse;
# We no longer store links in the zone hash, so we don't need to do this.
# # Eliminate zone links that have no corresponding zone
# foreach (keys %$ZONES) {
# if (exists $ZONES->{$_}->{link} && !exists $ZONES->{$_}->{rule}) {
# if (0) {
# print STDERR
# "Deleting link from historical/nonexistent zone: ",
# $_, " -> ", $ZONES->{$_}->{link}, "\n";
# }
# delete $ZONES->{$_};
# }
# }
# Check that each zone has a corresponding rule. At the same
# time, build up a hash that marks each rule that is in use.
foreach (sort keys %$ZONES) {
my $ruleName = $ZONES->{$_}->{rule};
next if ($ruleName eq $STANDARD);
if (exists $RULES->{$ruleName}) {
$ruleInUse{$ruleName} = 1;
} else {
# This means the zone is using the standard rule now
$ZONES->{$_}->{rule} = $STANDARD;
}
}
# Check that both parts are there for rules
# Check for unused rules
# Make coded string for comparisons
foreach (keys %$RULES) {
if (!exists $ruleInUse{$_}) {
if (0) {
print STDERR "Deleting historical/unused rule: $_\n";
}
delete $RULES->{$_};
} elsif (!$RULES->{$_}->[0] || !$RULES->{$_}->[1]) {
print STDERR "Rule doesn't have both parts: $_\n";
} else {
# Generate coded string
# This has all the data about a rule; it can be used
# to see if two rules behave identically
$RULES->{$_}->[2] =
lc($RULES->{$_}->[0]->{in} . "," .
$RULES->{$_}->[0]->{on} . "," .
$RULES->{$_}->[0]->{at} . "," .
$RULES->{$_}->[0]->{save} . ";" .
$RULES->{$_}->[1]->{in} . "," .
$RULES->{$_}->[1]->{on} . "," .
$RULES->{$_}->[1]->{at}); # [1]->{save} is always zero
}
}
}
######################################################################
# Create a clone of the zone $oldID named $newID in the hash $ZONES.
# Param: ref to hash of zones
# Param: ID of zone to clone
# Param: ID of new zone
sub _CloneZone {
my $ZONES = shift;
my $oldID = shift;
my $newID = shift;
for my $field (keys %{$ZONES->{$oldID}}) {
$ZONES->{$newID}->{$field} = $ZONES->{$oldID}->{$field};
}
}
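The rule-selection scheme ParseFile uses for Rule lines (keep overwriting a slot with the latest rule seen until a rule matching the current year locks it) can be restated outside Perl. This is an illustrative transliteration of the logic, not the actual tz.pl code:

```python
# Sketch of the "keep latest, lock on current-year match" selection.
# Slot 0 holds the onset rule (save != 0), slot 1 the cease rule
# (save == 0); a 2-bit mask records which slots are locked.

def matches_current(r, year):
    frm, to = r["from"], r["to"]
    if to == "only":
        return frm == year
    if to == "max":
        return frm <= year
    return frm <= year <= int(to)

def select_rules(rule_lines, year):
    slots = [None, None]
    locked = 0                              # bit 0: onset, bit 1: cease
    for r in rule_lines:                    # rules in file order
        i = 0 if r["save"] != "0" else 1
        if locked & (1 << i):
            continue                        # already have a current rule
        slots[i] = r                        # overwrite with latest seen
        if matches_current(r, year):
            locked |= 1 << i                # lock: stop overwriting
    return slots

# The Tonga example from the comments: both rules run to "max",
# so for a recent year both slots lock on first sight.
tonga = [
    {"from": 1999, "to": "max", "save": "1:00"},
    {"from": 2000, "to": "max", "save": "0"},
]
onset, cease = select_rules(tonga, 2001)
```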


@ -1,238 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# See: ftp://elsie.nci.nih.gov/pub/tzdata<year>
# where <year> is "1999b" or a similar string.
######################################################################
# This package contains utility functions for time zone data.
# Author: Alan Liu
######################################################################
# Zones - A time zone object is a hash with the following keys:
# {gmtoff} The offset from GMT, e.g. "-5:00"
# {rule} The name of the rule, e.g. "-", "Canada", "EU", "US"
# {format} The local abbreviation, e.g. "E%sT"
# {until} Data is good until this year, e.g., "2000". Often blank.
# These correspond to file entries:
#|# Zone NAME GMTOFF RULES FORMAT [UNTIL]
#|Zone America/Montreal -4:54:16 - LMT 1884
#| -5:00 Mont E%sT
# Links come from the file entries:
#|# Link LINK-FROM LINK-TO
#|Link America/New_York EST5EDT
#|Link America/Chicago CST6CDT
# Link data is _not_ stored in the zone hash. Instead, links are
# kept in a separate hash and resolved after all zones are defined.
# In general, we ignore links, but they provide critical data when
# generating country information.
# The name of the zone itself is not kept in the zone object.
# Instead, zones are kept in a big hash. The keys are the names; the
# values are references to the zone objects. The big hash of all
# zones is referred to in all caps: %ZONES ($ZONES if it's a
# reference).
# Example: $ZONES->{"America/Los_Angeles"} =
# 'format' => 'P%sT'
# 'gmtoff' => '-8:00'
# 'rule' => 'US'
# 'until' => ''
######################################################################
# Rules - A time zone rule is an array with the following elements:
# [0] Onset rule
# [1] Cease rule
# [2] Encoded string
# The onset rule and cease rule have the same format. They are each
# references to a hash with keys:
# {from} Start year
# {to} End year, or "only" or "max"
# {type} Unknown, usually "-"
# {in} Month, 3 letters
# {on} Day specifier, e.g. "lastSun", "Sun>=1", "23"
# {at} Time, e.g. "2:00", "1:00u"
# {save} Amount of savings, for the onset; 0 for the cease
# {letter} Guess: the letter that goes into %s in the zone {format}
# These correspond to the file entries thus:
#|# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S
#|Rule US 1942 only - Feb 9 2:00 1:00 W # War
#|Rule US 1945 only - Sep 30 2:00 0 S
#|Rule US 1967 max - Oct lastSun 2:00 0 S
#|Rule US 1967 1973 - Apr lastSun 2:00 1:00 D
#|Rule US 1974 only - Jan 6 2:00 1:00 D
#|Rule US 1975 only - Feb 23 2:00 1:00 D
#|Rule US 1976 1986 - Apr lastSun 2:00 1:00 D
#|Rule US 1987 max - Apr Sun>=1 2:00 1:00 D
# Entry [2], the encoded string, is used to see if two rules are the
# same. It consists of "[0]->{in},[0]->{on},[0]->{at},[0]->{save};
# [1]->{in},[1]->{on},[1]->{at}". Note that the separator between
# values is a comma, between onset and cease is a semicolon. Also
# note that the cease {save} is not used as this is always 0. The
# whole string is forced to lowercase.
# Rules don't contain their own name. Like zones, rules are kept in a
# big hash; the keys are the names, the values the references to the
# arrays. This hash of all rules is referred to in all caps, %RULES
# or for a reference, $RULES.
# Example: $RULES->{"US"} =
# 0 HASH(0x8fa03c)
# 'at' => '2:00'
# 'from' => 1987
# 'in' => 'Apr'
# 'letter' => 'D'
# 'on' => 'Sun>=1'
# 'save' => '1:00'
# 'to' => 'max'
# 'type' => '-'
# 1 HASH(0x8f9fc4)
# 'at' => '2:00'
# 'from' => 1967
# 'in' => 'Oct'
# 'letter' => 'S'
# 'on' => 'lastSun'
# 'save' => 0
# 'to' => 'max'
# 'type' => '-'
# 2 'apr,sun>=1,2:00,1:00;oct,lastsun,2:00'
package TZ;
use strict;
use Carp;
use vars qw(@ISA @EXPORT $VERSION $STANDARD);
require 'dumpvar.pl';
@ISA = qw(Exporter);
@EXPORT = qw(ZoneEquals
RuleEquals
ZoneCompare
RuleCompare
FormZoneEquivalencyGroups
ParseOffset
);
$VERSION = '0.1';
$STANDARD = '-'; # Name of the Standard Time rule
######################################################################
# Param: zone object (hash ref)
# Param: zone object (hash ref)
# Param: ref to hash of all rules
# Return: 0, -1, or 1
sub ZoneCompare {
my $z1 = shift;
my $z2 = shift;
my $RULES = shift;
($z1, $z2) = ($z1->{rule}, $z2->{rule});
return RuleCompare($RULES->{$z1}, $RULES->{$z2});
}
######################################################################
# Param: rule object (hash ref)
# Param: rule object (hash ref)
# Return: 0, -1, or 1
sub RuleCompare {
my $r1 = shift;
my $r2 = shift;
# Just compare the precomputed encoding strings.
# defined() catches undefined rules. The only undefined
# rule is $STANDARD; any others would be caught by
# Postprocess().
defined($r1)
? (defined($r2) ? ($r1->[2] cmp $r2->[2]) : 1)
: (defined($r2) ? -1 : 0);
# In theory, there's actually one more level of equivalency
# analysis we could do. This is to recognize that Sun >=1 is the
# same as First Sun. We don't do this yet, but it doesn't matter;
# such a date is always referred to as Sun>=1, never as firstSun.
}
######################################################################
# Param: zone object (hash ref)
# Param: zone object (hash ref)
# Param: ref to hash of all rules
# Return: true if two zones are equivalent
sub ZoneEquals {
ZoneCompare(@_) == 0;
}
######################################################################
# Param: rule object (hash ref)
# Param: rule object (hash ref)
# Return: true if two rules are equivalent
sub RuleEquals {
RuleCompare(@_) == 0;
}
######################################################################
# Given a hash of all zones and a hash of all rules, create a list
# of equivalency groups. These are groups of zones with the same
# offset and equivalent rules. Equivalency is tested with
# ZoneEquals and RuleEquals. The resultant equivalency list is an
# array of refs to groups. Each group is an array of one or more
# zone names.
# Param: IN ref to hash of all zones
# Param: IN ref to hash of all rules
# Param: OUT ref to array to receive group refs
sub FormZoneEquivalencyGroups {
my ($zones, $rules, $equiv) = @_;
# Group the zones by offset. This improves efficiency greatly;
# instead of an n^2 computation, we just need to do n^2 within
# each offset; a much smaller total number.
my %zones_by_offset;
foreach (keys %$zones) {
push @{$zones_by_offset{ParseOffset($zones->{$_}->{gmtoff})}}, $_;
}
# Find equivalent rules
foreach my $gmtoff (keys %zones_by_offset) {
# Make an array of equivalency groups
# (array of refs to array of names)
my @equiv;
foreach my $name1 (@{$zones_by_offset{$gmtoff}}) {
my $found = 0;
foreach my $group (@equiv) {
my $name2 = $group->[0];
if (ZoneEquals($zones->{$name1}, $zones->{$name2}, $rules)) {
push @$group, $name1;
$found = 1;
last;
}
}
if (!$found) {
my @newGroup = ( $name1 );
push @equiv, \@newGroup;
}
}
push @$equiv, @equiv;
}
}
######################################################################
# Parse an offset of the form d, d:dd, or d:dd:dd, or any of the above
# preceded by a '-'. Return the total number of seconds represented.
# Param: String
# Return: Integer number of seconds
sub ParseOffset {
local $_ = shift;
if (/^(-)?(\d{1,2})(:(\d\d))?(:(\d\d))?$/) {
# 1 2 3 4 5 6
my $a = (($2 * 60) + (defined $4?$4:0)) * 60 + (defined $6?$6:0);
$a = -$a if (defined $1 && $1 eq '-');
return $a;
} else {
confess "Cannot parse offset \"$_\"";
}
}
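ParseOffset's contract (d, d:dd, or d:dd:dd, optionally preceded by '-', returning total seconds) translates directly; the following Python transliteration is illustrative only, not part of the tool:

```python
import re

# Python transliteration of TZU.pm's ParseOffset. Accepts "d", "d:dd",
# or "d:dd:dd", optionally with a leading '-', and returns total seconds.

def parse_offset(s: str) -> int:
    m = re.match(r'^(-)?(\d{1,2})(?::(\d\d))?(?::(\d\d))?$', s)
    if not m:
        raise ValueError(f'Cannot parse offset "{s}"')
    sign, h, mn, sec = m.groups()
    total = (int(h) * 60 + int(mn or 0)) * 60 + int(sec or 0)
    return -total if sign else total

print(parse_offset("-5:00"))     # -> -18000
print(parse_offset("5:45"))      # -> 20700
print(parse_offset("-4:54:16"))  # -> -17656 (the Montreal LMT offset above)
```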