ICU-2966 Remove these out of date files. Look in icu/source/tools/gentz/tzcode for new code.
X-SVN-Rev: 13984
parent 0fbe7eb84e
commit a944df48d8
@@ -1,114 +1,4 @@
Copyright (C) 1999-2001, International Business Machines Corporation
Copyright (C) 1999-2003, International Business Machines Corporation
and others. All Rights Reserved.

Readme file for ICU time zone data (source/tools/gentz)

Alan Liu
Last updated 2 Feb 2001


RAW DATA
--------
The time zone data in ICU is taken from the UNIX data files at
ftp://elsie.nci.nih.gov/pub/tzdata<year>. The other input to the
process is an alias table, described below.


BUILD PROCESS
-------------
Two tools are used to process the data into a format suitable for ICU:

   tz.pl   directory of raw data files + tz.alias -> tz.txt
   gentz   tz.txt -> tz.dat (memory mappable binary file)

After gentz is run, standard ICU data tools are used to incorporate
tz.dat into the icudata module. The tz.pl script is run manually;
everything else is automatic.

In order to incorporate the raw data from that source into ICU, take
the following steps.

1. Download the archive of current zone data. This should be a file
   named something like tzdata1999j.tar.gz. Use the URL listed above.

2. Unpack the archive into a directory, retaining the name of the
   archive. For example, unpack tzdata1999j.tar.gz into tzdata1999j/.
   Place this directory anywhere; one option is to place it within
   source/tools/gentz.

3. Run the perl script tz.pl, passing it the directory location as a
   command-line argument. On Windows systems, use the batch file
   tz.bat. Also specify one or more output files: .txt, .htm|.html,
   and .java.

   For ICU4C specify .txt; typically

      <icu>/source/data/misc/timezone.txt

   where icu is the ICU4C root directory. Double check that this is
   the correct location and file name; they change periodically.

   It is useful to generate an html file. After it is generated,
   review it for correctness.

   As the third argument, pass in "tz.java". This will generate a
   java source file that will be used to update the ICU4J data.
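
   For example, a typical invocation might look like this (illustrative
   only; the tzdata directory name and the output paths are placeholders
   that must be adjusted to your layout):

      perl tz.pl tzdata1999j ../../data/misc/timezone.txt tz.htm tz.java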

4. Do a standard build. The build scripts will automatically detect
   that a new .txt file is present and rebuild the binary data (using
   gentz) from that.

The .txt and .htm files are typically checked into CVS, whereas
the raw data files are not, since they are readily available from the
URL listed above.

Additional steps are required to update the ICU4J data. First you
must have a current, working installation of icu4j. These instructions
will assume it is in directory "/icu4j".

5. Copy the tz.java file generated in step 3 to /icu4j/tz.java.

6. Change to the /icu4j directory and compile the tz.java file, with
   /icu4j/classes on the classpath.

7. Run the resulting java program (again with /icu4j/classes on the
   classpath) and capture the output in a file named tz.tmp.
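
   For example (hypothetical commands; the generated class name, the
   paths, and the classpath separator depend on your platform and on
   the file produced in step 3):

      cd /icu4j
      javac -classpath classes tz.java
      java -classpath .:classes tz > tz.tmp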

8. Open /icu4j/src/com/ibm/util/TimeZoneData.java. Delete the section
   that starts with the line "BEGIN GENERATED SOURCE CODE" and ends
   with the line "END GENERATED SOURCE CODE". Replace it with the
   contents of tz.tmp. If there are extraneous control-M characters
   or other similar problems, fix them.

9. Rebuild icu4j and make sure there are no build errors. Rerun all
   the tests in /icu4j/src/com/ibm/test/timezone and make sure they
   all pass. If all is well, check the new TimeZoneData.java into
   CVS.


ALIAS TABLE
-----------
For backward compatibility, we define several three-letter IDs that
have been used since early ICU and correspond to IDs used in old JDKs.
These IDs are listed in tz.alias. The tz.pl script processes this
alias table and issues errors if there are problems.
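
For example, the following entry from tz.alias (the alias table is
included later in this commit) maps the legacy three-letter ID PST to
its Olson zone:

   PST     America/Los_Angeles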

IDS
---
All *system* zone IDs must consist only of characters in the invariant
set. See utypes.h for an explanation of what this means. If an ID is
encountered that contains a non-invariant character, tz.pl complains.
Non-system zones may use non-invariant characters.


Etc/GMT...
----------
Users may be confused by the fact that various zones with names of the
form Etc/GMT+n appear to have an offset of the wrong sign. For
example, Etc/GMT+8 is 8 hours *behind* GMT; that is, it corresponds to
what one typically sees displayed as "GMT-8:00". The reason for this
inversion is explained in the UNIX zone data file "etcetera".
Briefly, this is done intentionally in order to comply with
POSIX-style signedness. In ICU we reproduce the UNIX zone behavior
faithfully, including this confusing aspect.

The gentz tool went away. Please look at icu/source/tools/gentz/tzcode/readme.txt
@@ -1,45 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# A simple alias list. We use this to retain backward compatibility.
# For example, ICU has always defined the zone name "PST" to indicate
# the zone America/Los_Angeles. Unless we continue to have a zone with
# this ID, legacy code may break.
#
# This list is read by tz.pl and the alias names listed here are
# incorporated into the tz.txt file as clones.
#
# Format: alias_name unix_name # optional comment

ACT     Australia/Darwin
AET     Australia/Sydney
AGT     America/Buenos_Aires
ART     Africa/Cairo
AST     America/Anchorage
BET     America/Sao_Paulo
BST     Asia/Dhaka            # spelling changed in 2000h; was Asia/Dacca
CAT     Africa/Harare
CNT     America/St_Johns
CST     America/Chicago
CTT     Asia/Shanghai
EAT     Africa/Addis_Ababa
ECT     Europe/Paris
# EET   Europe/Istanbul       # EET is a standard UNIX zone
EST     America/New_York      # Linked to America/Indianapolis in Olson
HST     Pacific/Honolulu      # Olson LINK
IET     America/Indianapolis
IST     Asia/Calcutta
JST     Asia/Tokyo
# MET   Asia/Tehran           # MET is a standard UNIX zone
MIT     Pacific/Apia
MST     America/Denver        # Linked to America/Phoenix in Olson
NET     Asia/Yerevan
NST     Pacific/Auckland
PLT     Asia/Karachi
PNT     America/Phoenix
PRT     America/Puerto_Rico
PST     America/Los_Angeles
SST     Pacific/Guadalcanal
UTC     Etc/UTC               # Olson LINK
VST     Asia/Saigon
@@ -1,17 +0,0 @@
@echo off
REM Copyright (C) 1999, International Business Machines
REM Corporation and others. All Rights Reserved.

REM This script is a Windows launcher for the tz.pl script. For this
REM to work, the perl executable must be on the path. We recommend
REM the ActiveState build; see http://www.activestate.com. See the
REM tz.pl script itself for more documentation.

if "%OS%" == "Windows_NT" goto WinNT
perl -w -x -S "tz.pl" %1 %2 %3 %4 %5 %6 %7 %8 %9
goto end
:WinNT
perl -w -x -S "tz.pl" %*
if NOT "%COMSPEC%" == "%SystemRoot%\system32\cmd.exe" goto end
if %errorlevel% == 9009 echo You do not have Perl in your PATH.
:end
@@ -1,42 +0,0 @@
######################################################################
# Copyright (C) 1999-2003, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# Default zone list. If ICU cannot find an exact match for the host
# time zone, it picks a zone that matches the host offset. There may
# be many such zones, however, so it must know which one to select.
#
# This list is read by tz.pl and the default names listed here are
# incorporated into the tz.txt file as preferred default zones.
# Any conflicts (multiple defaults for the same offset) or absences
# (no defaults specified for an offset) are reported.
#
# Format: default_name # optional comment

Africa/Addis_Ababa
Africa/Cairo
America/Anchorage
America/Buenos_Aires
America/Chicago
America/Denver
America/New_York
America/Los_Angeles
America/Puerto_Rico
America/St_Johns
Asia/Calcutta
Asia/Dhaka           # spelling changed in 2000h; was Asia/Dacca
Asia/Karachi
Asia/Riyadh89        # Pick the chronologically latest of this group
Asia/Saigon
Asia/Shanghai
Asia/Tokyo
Asia/Yerevan
Atlantic/Azores      # Windows lists Azores, Cape Verde
Australia/Darwin
Australia/Sydney
Europe/Paris
GMT
Pacific/Apia
Pacific/Auckland
Pacific/Guadalcanal
Pacific/Honolulu
File diff suppressed because it is too large
@@ -1,328 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# See: ftp://elsie.nci.nih.gov/pub/tzdata<year>
# where <year> is "1999b" or a similar string.
######################################################################
# This package handles the parsing of time zone files.
# Author: Alan Liu
######################################################################
# Usage:
# Call ParseFile for each file to be imported. Then call ParseZoneTab
# to add country data. Then call Postprocess to remove unused rules.
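#
# Illustrative calling sequence (a sketch only, not part of this module;
# the data-file names and the year are assumptions for the example):
#
#   my (%ZONES, %RULES, %LINKS);
#   foreach my $f (qw(africa asia australasia europe northamerica)) {
#       ParseFile("tzdata1999j/$f", \%ZONES, \%RULES, \%LINKS, 1999);
#   }
#   ParseZoneTab("tzdata1999j/zone.tab", \%ZONES, \%LINKS);
#   Postprocess(\%ZONES, \%RULES);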

package TZ;
use strict;
use Carp;
use vars qw(@ISA @EXPORT $VERSION $YEAR $STANDARD);
require 'dumpvar.pl';

@ISA = qw(Exporter);
@EXPORT = qw(ParseFile
             Postprocess
             ParseZoneTab
             );
$VERSION = '0.2';

$STANDARD = '-'; # Name of the Standard Time rule

######################################################################
# Read the tzdata zone.tab file and add a {country} field to zones
# in the given hash.
# Param: File name (<dir>/zone.tab)
# Param: Ref to hash of zones
# Param: Ref to hash of links
sub ParseZoneTab {
    my ($FILE, $ZONES, $LINKS) = @_;

    my %linkEntries;

    local(*FILE);
    open(FILE,"<$FILE") or confess "Can't open $FILE: $!";
    while (<FILE>) {
        # Handle comments
        s/\#.*//;
        next if (!/\S/);

        if (/^\s*([A-Z]{2})\s+[-+0-9]+\s+(\S+)/) {
            my ($country, $zone) = ($1, $2);
            if (exists $ZONES->{$zone}) {
                $ZONES->{$zone}->{country} = $country;
            } elsif (exists $LINKS->{$zone}) {
                # We have a country mapping for a zone that isn't in
                # our hash. This means it is a link entry. Save this
                # then handle it below.
                $linkEntries{$zone} = $country;
            } else {
                print STDERR "Nonexistent zone $zone in $FILE\n";
            }
        } else {
            confess "Can't parse line \"$_\" of $FILE";
        }
    }
    close(FILE);

    # Now that we have mapped all of the zones in %$ZONES (except
    # those without country affiliations), process the link entries.
    # For those zones in the table that differ by country from their
    # source zone, instantiate a new zone in the new country. An
    # example is Europe/Vatican, which is linked to Europe/Rome. If
    # we don't instantiate it, we have nothing for Vatican City.
    # Another example is America/Shiprock, which links to
    # America/Denver. These are identical and both in the US, so we
    # don't instantiate America/Shiprock.
    foreach my $zone (keys %linkEntries) {
        my $country = $linkEntries{$zone};
        my $linkZone = $LINKS->{$zone};
        my $linkCountry = $ZONES->{$linkZone}->{country};
        if ($linkCountry ne $country) {
            # print "Cloning $zone ($country) from $linkZone ($linkCountry)\n";
            _CloneZone($ZONES, $LINKS->{$zone}, $zone);
            $ZONES->{$zone}->{country} = $country;
        }
    }
}

######################################################################
# Param: File name
# Param: Ref to hash of zones
# Param: Ref to hash of rules
# Param: Ref to hash of links
# Param: Current year
sub ParseFile {
    my ($FILE, $ZONES, $RULES, $LINKS, $YEAR) = @_;

    local(*FILE);
    open(FILE,"<$FILE") or confess "Can't open $FILE: $!";
    my $zone; # Current zone
    my $badLineCount = 0;
    while (<FILE>) {
        # Handle comments and blanks
        s/\#.*//;
        next if (!/\S/);

        #|# Zone  NAME                 GMTOFF    RULES   FORMAT  [UNTIL]
        #|Zone America/Montreal        -4:54:16  -       LMT     1884
        #|                             -5:00     Mont    E%sT
        #|Zone America/Thunder_Bay     -5:57:00  -       LMT     1895
        #|                             -5:00     Canada  E%sT    1970
        #|                             -5:00     Mont    E%sT    1973
        #|                             -5:00     -       EST     1974
        #|                             -5:00     Canada  E%sT
        my ($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil);
        if (/^zone/i) {
            # Zone block start
            if (/^zone\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/i
                || /^zone\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)()/i) {
                $zone = $1;
                ($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil) =
                    ($2, $3, $4, $5);
            } else {
                print STDERR "Can't parse in $FILE: $_";
                ++$badLineCount;
            }
        } elsif (/^\s/ && $zone) {
            # Zone continuation
            if (/^\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/
                || /^\s+(\S+)\s+(\S+)\s+(\S+)()/) {
                ($zoneGmtoff, $zoneRule, $zoneFormat, $zoneUntil) =
                    ($1, $2, $3, $4);
            } else {
                print STDERR "Can't parse in $FILE: $_";
                ++$badLineCount;
            }
        } elsif (/^rule/i) {
            # Here is where we parse a single line of the rule table.
            # Our goal is to accept only rules applying to the current
            # year. This is normally a matter of accepting rules
            # that match the current year. However, in some cases this
            # is more complicated. For example:
            #|# Tonga
            #|# Rule NAME   FROM  TO   TYPE  IN   ON       AT     SAVE  LETTER/S
            #|Rule   Tonga  1999  max  -     Oct  Sat>=1   2:00s  1:00  S
            #|Rule   Tonga  2000  max  -     Apr  Sun>=16  2:00s  0     -
            # To handle this properly, we save every rule we encounter
            # (thus overwriting older ones with newer ones, since rules
            # are listed in order), and also use slot [2] to mark when
            # we see a current year rule. When that happens, we stop
            # saving rules. Thus we match the latest rule we see, or
            # a matching rule if we find one. The format of slot [2]
            # is just a 2 bit flag ([2]&1 means slot [0] matched,
            # [2]&2 means slot [1] matched).

            # Note that later, when the rules are post processed
            # (see Postprocess), the slot [2] will be overwritten
            # with the compressed rule string used to implement
            # equality testing.

            $zone = undef;
            # Rule
            #|# Rule NAME  FROM  TO    TYPE  IN   ON       AT    SAVE  LETTER/S
            #|Rule   US    1918  1919  -     Mar  lastSun  2:00  1:00  W # War
            #|Rule   US    1918  1919  -     Oct  lastSun  2:00  0     S
            #|Rule   US    1942  only  -     Feb  9        2:00  1:00  W # War
            #|Rule   US    1945  only  -     Sep  30       2:00  0     S
            #|Rule   US    1967  max   -     Oct  lastSun  2:00  0     S
            #|Rule   US    1967  1973  -     Apr  lastSun  2:00  1:00  D
            #|Rule   US    1974  only  -     Jan  6        2:00  1:00  D
            #|Rule   US    1975  only  -     Feb  23       2:00  1:00  D
            #|Rule   US    1976  1986  -     Apr  lastSun  2:00  1:00  D
            #|Rule   US    1987  max   -     Apr  Sun>=1   2:00  1:00  D
            if (/^rule\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+
                 (\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/xi) {
                my ($name, $from, $to, $type, $in, $on, $at, $save, $letter) =
                    ($1, $2, $3, $4, $5, $6, $7, $8, $9);
                my $i = $save ? 0 : 1;

                if (!exists $RULES->{$name}) {
                    $RULES->{$name} = [];
                }
                my $ruleArray = $RULES->{$name};

                # Check our bit mask to see if we've already matched
                # a current rule. If so, do nothing. If not, then
                # save this rule line as the best one so far.
                if (@{$ruleArray} < 3 ||
                    !($ruleArray->[2] & 1 << $i)) {
                    my $h = $ruleArray->[$i];
                    $ruleArray->[$i]->{from} = $from;
                    $ruleArray->[$i]->{to} = $to;
                    $ruleArray->[$i]->{type} = $type;
                    $ruleArray->[$i]->{in} = $in;
                    $ruleArray->[$i]->{on} = $on;
                    $ruleArray->[$i]->{at} = $at;
                    $ruleArray->[$i]->{save} = $save;
                    $ruleArray->[$i]->{letter} = $letter;

                    # Does this rule match the current year? If so,
                    # set the bit mask so we don't overwrite this rule.
                    # This makes us ignore rules for subsequent years
                    # that are already listed in the database -- as long
                    # as we have an overriding rule for the current year.
                    if (($from == $YEAR && $to =~ /only/i) ||
                        ($from <= $YEAR &&
                         (($to =~ /^\d/ && $YEAR <= $to) || $to =~ /max/i))) {
                        $ruleArray->[2] |= 1 << $i;
                        $ruleArray->[3] |= 1 << $i;
                    }
                }
            } else {
                print STDERR "Can't parse in $FILE: $_";
                ++$badLineCount;
            }
        } elsif (/^link/i) {
            #|# Old names, for S5 users
            #|
            #|# Link  LINK-FROM             LINK-TO
            #|Link    America/New_York      EST5EDT
            #|Link    America/Chicago       CST6CDT
            #|Link    America/Denver        MST7MDT
            #|Link    America/Los_Angeles   PST8PDT
            #|Link    America/Indianapolis  EST
            #|Link    America/Phoenix       MST
            #|Link    Pacific/Honolulu      HST
            #
            # There are also links for country-specific zones.
            # These are zones that differ only in that they belong
            # to a different country. E.g.,
            #|Link    Europe/Rome           Europe/Vatican
            #|Link    Europe/Rome           Europe/San_Marino
            if (/^link\s+(\S+)\s+(\S+)/i) {
                my ($from, $to) = ($1, $2);
                # Record all links in %$LINKS
                $LINKS->{$to} = $from;
            } else {
                print STDERR "Can't parse in $FILE: $_";
                ++$badLineCount;
            }
        } else {
            # Unexpected line
            print STDERR "Ignoring in $FILE: $_";
            ++$badLineCount;
        }
        if ($zoneRule &&
            ($zoneUntil !~ /\S/ || ($zoneUntil =~ /^\d/ &&
                                    $zoneUntil >= $YEAR))) {
            $ZONES->{$zone}->{gmtoff} = $zoneGmtoff;
            $ZONES->{$zone}->{rule} = $zoneRule;
            $ZONES->{$zone}->{format} = $zoneFormat;
            $ZONES->{$zone}->{until} = $zoneUntil;
        }
    }
    close(FILE);
}

######################################################################
# Param: Ref to hash of zones
# Param: Ref to hash of rules
sub Postprocess {
    my ($ZONES, $RULES) = @_;
    my %ruleInUse;

    # We no longer store links in the zone hash, so we don't need to do this.
    # # Eliminate zone links that have no corresponding zone
    # foreach (keys %$ZONES) {
    #     if (exists $ZONES->{$_}->{link} && !exists $ZONES->{$_}->{rule}) {
    #         if (0) {
    #             print STDERR
    #                 "Deleting link from historical/nonexistent zone: ",
    #                 $_, " -> ", $ZONES->{$_}->{link}, "\n";
    #         }
    #         delete $ZONES->{$_};
    #     }
    # }

    # Check that each zone has a corresponding rule. At the same
    # time, build up a hash that marks each rule that is in use.
    foreach (sort keys %$ZONES) {
        my $ruleName = $ZONES->{$_}->{rule};
        next if ($ruleName eq $STANDARD);
        if (exists $RULES->{$ruleName}) {
            $ruleInUse{$ruleName} = 1;
        } else {
            # This means the zone is using the standard rule now
            $ZONES->{$_}->{rule} = $STANDARD;
        }
    }

    # Check that both parts are there for rules
    # Check for unused rules
    # Make coded string for comparisons
    foreach (keys %$RULES) {
        if (!exists $ruleInUse{$_}) {
            if (0) {
                print STDERR "Deleting historical/unused rule: $_\n";
            }
            delete $RULES->{$_};
        } elsif (!$RULES->{$_}->[0] || !$RULES->{$_}->[1]) {
            print STDERR "Rule doesn't have both parts: $_\n";
        } else {
            # Generate coded string
            # This has all the data about a rule; it can be used
            # to see if two rules behave identically
            $RULES->{$_}->[2] =
                lc($RULES->{$_}->[0]->{in} . "," .
                   $RULES->{$_}->[0]->{on} . "," .
                   $RULES->{$_}->[0]->{at} . "," .
                   $RULES->{$_}->[0]->{save} . ";" .
                   $RULES->{$_}->[1]->{in} . "," .
                   $RULES->{$_}->[1]->{on} . "," .
                   $RULES->{$_}->[1]->{at}); # [1]->{save} is always zero
        }
    }
}

######################################################################
# Create a clone of the zone $oldID named $newID in the hash $ZONES.
# Param: ref to hash of zones
# Param: ID of zone to clone
# Param: ID of new zone
sub _CloneZone {
    my $ZONES = shift;
    my $oldID = shift;
    my $newID = shift;
    for my $field (keys %{$ZONES->{$oldID}}) {
        $ZONES->{$newID}->{$field} = $ZONES->{$oldID}->{$field};
    }
}
@@ -1,238 +0,0 @@
######################################################################
# Copyright (C) 1999-2001, International Business Machines
# Corporation and others. All Rights Reserved.
######################################################################
# See: ftp://elsie.nci.nih.gov/pub/tzdata<year>
# where <year> is "1999b" or a similar string.
######################################################################
# This package contains utility functions for time zone data.
# Author: Alan Liu

######################################################################
# Zones - A time zone object is a hash with the following keys:
#   {gmtoff} The offset from GMT, e.g. "-5:00"
#   {rule}   The name of the rule, e.g. "-", "Canada", "EU", "US"
#   {format} The local abbreviation, e.g. "E%sT"
#   {until}  Data is good until this year, e.g., "2000". Often blank.

# These correspond to file entries:
#|# Zone  NAME             GMTOFF    RULES  FORMAT  [UNTIL]
#|Zone America/Montreal    -4:54:16  -      LMT     1884
#|                         -5:00     Mont   E%sT

# Links come from the file entries:
#|# Link  LINK-FROM         LINK-TO
#|Link    America/New_York  EST5EDT
#|Link    America/Chicago   CST6CDT
# Link data is _not_ stored in the zone hash. Instead, links are
# kept in a separate hash and resolved after all zones are defined.
# In general, we ignore links, but they provide critical data when
# generating country information.

# The name of the zone itself is not kept in the zone object.
# Instead, zones are kept in a big hash. The keys are the names; the
# values are references to the zone objects. The big hash of all
# zones is referred to in all caps: %ZONES ($ZONES if it's a
# reference).

# Example: $ZONES->{"America/Los_Angeles"} =
#   'format' => 'P%sT'
#   'gmtoff' => '-8:00'
#   'rule'   => 'US'
#   'until'  => ''

######################################################################
# Rules - A time zone rule is an array with the following elements:
#   [0] Onset rule
#   [1] Cease rule
#   [2] Encoded string

# The onset rule and cease rule have the same format. They are each
# references to a hash with keys:
#   {from}   Start year
#   {to}     End year, or "only" or "max"
#   {type}   Unknown, usually "-"
#   {in}     Month, 3 letters
#   {on}     Day specifier, e.g. "lastSun", "Sun>=1", "23"
#   {at}     Time, e.g. "2:00", "1:00u"
#   {save}   Amount of savings, for the onset; 0 for the cease
#   {letter} Guess: the letter that goes into %s in the zone {format}

# These correspond to the file entries thus:
#|# Rule NAME  FROM  TO    TYPE  IN   ON       AT    SAVE  LETTER/S
#|Rule   US    1942  only  -     Feb  9        2:00  1:00  W # War
#|Rule   US    1945  only  -     Sep  30       2:00  0     S
#|Rule   US    1967  max   -     Oct  lastSun  2:00  0     S
#|Rule   US    1967  1973  -     Apr  lastSun  2:00  1:00  D
#|Rule   US    1974  only  -     Jan  6        2:00  1:00  D
#|Rule   US    1975  only  -     Feb  23       2:00  1:00  D
#|Rule   US    1976  1986  -     Apr  lastSun  2:00  1:00  D
#|Rule   US    1987  max   -     Apr  Sun>=1   2:00  1:00  D

# Entry [2], the encoded string, is used to see if two rules are the
# same. It consists of "[0]->{in},[0]->{on},[0]->{at},[0]->{save};
# [1]->{in},[1]->{on},[1]->{at}". Note that the separator between
# values is a comma, between onset and cease is a semicolon. Also
# note that the cease {save} is not used as this is always 0. The
# whole string is forced to lowercase.

# Rules don't contain their own name. Like zones, rules are kept in a
# big hash; the keys are the names, the values the references to the
# arrays. This hash of all rules is referred to in all caps, %RULES
# or for a reference, $RULES.

# Example: $RULES->{"US"} =
#   0  HASH(0x8fa03c)
#      'at'     => '2:00'
#      'from'   => 1987
#      'in'     => 'Apr'
#      'letter' => 'D'
#      'on'     => 'Sun>=1'
#      'save'   => '1:00'
#      'to'     => 'max'
#      'type'   => '-'
#   1  HASH(0x8f9fc4)
#      'at'     => '2:00'
#      'from'   => 1967
#      'in'     => 'Oct'
#      'letter' => 'S'
#      'on'     => 'lastSun'
#      'save'   => 0
#      'to'     => 'max'
#      'type'   => '-'
#   2  'apr,sun>=1,2:00,1:00;oct,lastsun,2:00'

package TZ;
use strict;
use Carp;
use vars qw(@ISA @EXPORT $VERSION $STANDARD);
require 'dumpvar.pl';

@ISA = qw(Exporter);
@EXPORT = qw(ZoneEquals
             RuleEquals
             ZoneCompare
             RuleCompare
             FormZoneEquivalencyGroups
             ParseOffset
             );
$VERSION = '0.1';

$STANDARD = '-'; # Name of the Standard Time rule

######################################################################
# Param: zone object (hash ref)
# Param: zone object (hash ref)
# Param: ref to hash of all rules
# Return: 0, -1, or 1
sub ZoneCompare {
    my $z1 = shift;
    my $z2 = shift;
    my $RULES = shift;

    ($z1, $z2) = ($z1->{rule}, $z2->{rule});

    return RuleCompare($RULES->{$z1}, $RULES->{$z2});
}

######################################################################
# Param: rule object (array ref)
# Param: rule object (array ref)
# Return: 0, -1, or 1
sub RuleCompare {
    my $r1 = shift;
    my $r2 = shift;

    # Just compare the precomputed encoding strings.
    # defined() catches undefined rules. The only undefined
    # rule is $STANDARD; any others would be caught by
    # Postprocess().

    defined($r1)
        ? (defined($r2) ? ($r1->[2] cmp $r2->[2]) : 1)
        : (defined($r2) ? -1 : 0);

    # In theory, there's actually one more level of equivalency
    # analysis we could do. This is to recognize that Sun >=1 is the
    # same as First Sun. We don't do this yet, but it doesn't matter;
    # such a date is always referred to as Sun>=1, never as firstSun.
}

######################################################################
# Param: zone object (hash ref)
# Param: zone object (hash ref)
# Param: ref to hash of all rules
# Return: true if two zones are equivalent
sub ZoneEquals {
    ZoneCompare(@_) == 0;
}

######################################################################
# Param: rule object (array ref)
# Param: rule object (array ref)
# Return: true if two rules are equivalent
sub RuleEquals {
    RuleCompare(@_) == 0;
}

######################################################################
# Given a hash of all zones and a hash of all rules, create a list
# of equivalency groups. These are groups of zones with the same
# offset and equivalent rules. Equivalency is tested with
# ZoneEquals and RuleEquals. The resultant equivalency list is an
# array of refs to groups. Each group is an array of one or more
# zone names.
# Param: IN  ref to hash of all zones
# Param: IN  ref to hash of all rules
# Param: OUT ref to array to receive group refs
sub FormZoneEquivalencyGroups {
    my ($zones, $rules, $equiv) = @_;

    # Group the zones by offset. This improves efficiency greatly;
    # instead of an n^2 computation, we just need to do n^2 within
    # each offset; a much smaller total number.
    my %zones_by_offset;
    foreach (keys %$zones) {
        push @{$zones_by_offset{ParseOffset($zones->{$_}->{gmtoff})}}, $_;
    }

    # Find equivalent rules
    foreach my $gmtoff (keys %zones_by_offset) {
        # Make an array of equivalency groups
        # (array of refs to array of names)
        my @equiv;
        foreach my $name1 (@{$zones_by_offset{$gmtoff}}) {
            my $found = 0;
            foreach my $group (@equiv) {
                my $name2 = $group->[0];
                if (ZoneEquals($zones->{$name1}, $zones->{$name2}, $rules)) {
                    push @$group, $name1;
                    $found = 1;
                    last;
                }
            }
            if (!$found) {
                my @newGroup = ( $name1 );
                push @equiv, \@newGroup;
            }
        }
        push @$equiv, @equiv;
    }
}
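
# Illustrative usage (a sketch only; assumes %ZONES and %RULES were
# populated by ParseFile and Postprocess, and are not defined here):
#
#   my @groups;
#   FormZoneEquivalencyGroups(\%ZONES, \%RULES, \@groups);
#   foreach my $group (@groups) {
#       print join(" ", @$group), "\n";  # zones sharing offset and rules
#   }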

######################################################################
# Parse an offset of the form d, d:dd, or d:dd:dd, or any of the above
# preceded by a '-'. Return the total number of seconds represented.
# Param: String
# Return: Integer number of seconds
sub ParseOffset {
    local $_ = shift;
    if (/^(-)?(\d{1,2})(:(\d\d))?(:(\d\d))?$/) {
        # 1   2          3 4        5 6
        my $a = (($2 * 60) + (defined $4?$4:0)) * 60 + (defined $6?$6:0);
        $a = -$a if (defined $1 && $1 eq '-');
        return $a;
    } else {
        confess "Cannot parse offset \"$_\"";
    }
}
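
# For example (illustrative values, following the formula above):
#
#   ParseOffset("-5:00");    # returns -18000
#   ParseOffset("1:30:15");  # returns 5415
#   ParseOffset("8");        # returns 28800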