IRC log of #zope3-dev for Tuesday, 2005-04-05

zagy: hi
*** Theuni has quit IRC [10:37]
tarek_: ayt ?
srichter: I will be online in about 3 hours
tarek_: ok
tarek_: ok
*** srichter has joined #zope3-dev [17:07]
philiKON: moin srichter
tarek_: hi srichter
srichter: hi
tarek_: i was wondering something about
tarek_: is there any other utility in zope 3 than DateTimeParser that can be used to normalize dates [17:13]
srichter: tarek_: no one should use datetimeutils :-) [17:14]
tarek_: it works fine except for some extended times [17:14]
srichter: I really should propose to deprecate it :-) [17:14]
srichter: what you should use is the datetime parser in zope.i18n [17:15]
tarek_: i am using it to parse mail dates [17:15]
tarek_: ok thanks [17:15]
srichter: it is pattern-based, like you are used to from Excel/Calc for example [17:15]
*** tvon has joined #zope3-dev [17:16]
srichter: tarek_: the zope.i18n code is not as user-friendly, but it does not guess [17:16]
srichter: which is a good thing [17:16]
srichter: it does what you tell it [17:17]
bska|mobile: tarek_: pytz has a normalize routine [17:17]
tarek_: my use case is to try to guess the date out of any mail date string [17:17]
tarek_: like datetimeparser does [17:17]
srichter: if you want to be able to detect several different formats, you have to write a high-level function that tries several patterns [17:17]
tarek_: ok, in my test cases, DateTimeParser passed all weird forms except the one with extra info like: Tue,  5 Apr 2005 11:33:39 +0200 (CEST) [17:18]
srichter: the pattern should be able to handle this [17:19]
srichter: but how does datetimeparser interpret: 01/01/01 [17:19]
srichter: for example? [17:19]
srichter: this is the problem with guessing [17:20]
srichter: the pattern approach allows you to be deterministic about the date [17:20]
tarek_: i'll have to look, but running on a real mailbox (+10k mails and a bunch of spams) -> i just had the CEST failure [17:20]
*** MohsenY has joined #zope3-dev [17:21]
srichter: well, it would not fail [17:21]
srichter: but the US starts using MM/DD/YY [17:21]
*** mohsenX has quit IRC [17:21]
srichter: and in Germany it is DD/MM/YY [17:21]
tarek_: yeah, but actually it's always US style in mails i think [17:22]
srichter: same goes for 01-01-01 [17:22]
srichter: the US uses it as MM-DD-YY [17:22]
srichter: the internal interpretation is YY-MM-DD [17:22]
* tarek_ is in France == like germany [17:22]
tarek_: ok so I delete :) [17:24]
srichter: you do not have to, but I would :-) [17:25]
tarek_: it's gone :) [17:26]
srichter: btw, if you develop a high-level datetime parsing function atop the zope.i18n code, then this is probably something you could contribute to the core [17:27]
srichter: I imagine a lot of people would want this [17:27]
philiKON: you mean datetime parsing based on locale? [17:27]
tarek_: email is a good test case [17:27]
srichter: btw, eventually I will also support fully localized naming, such as [17:27]
tarek_: because you get dates from all locales [17:27]
srichter: Lundi, Avril 4, 2005 [17:28]
philiKON: tarek_, do you? doesn't the mime standard define what date format you have to use? [17:28]
philiKON: srichter, i thought we have that already [17:28]
srichter: philiKON: currently it does not use the localized names I think [17:28]
srichter: let me check [17:28]
tarek_: philiKON: let me check what's fetched from imap [17:29]
srichter: no, it does [17:29]
srichter: you are right [17:29]
srichter: but, it only supports the names for the current locale [17:30]
srichter: so this might be an issue in something like E-mail, because the client might be set to French, but the date is in English; so currently you would have to look up the English locale manually; but maybe this is the right thing to do [17:31]
philiKON: you can get any locale by zope.i18n.locale.getLocale [17:31]
srichter: of course [17:31]
philiKON: i still don't understand the problem with emails [17:31]
philiKON: doesn't the MIME standard define what format the Date: header should be in?!? [17:31]
* srichter thinks he should sound much more authoritative on this subject as he wrote the code [17:31]
tarek_: philiKON: when you get an email, you get a Date header and just a MIME version [17:33]
philiKON: and the Date: header is in what date format? ISO? [17:34]
* tarek_ is looking [17:35]
tarek_: it can be in several shapes indeed [17:35]
*** sashav has quit IRC [17:37]
* tarek_ is comparing python/email/tests/ cases with his real cases [17:37]
tarek_: over 10k mails i had a limited number of patterns in fact [17:39]
tarek_: Wed,3 Apr 2002 14:58:26 +0800 : 99% of mails [17:39]
tarek_: 3 Apr 2002 14:58:26 +0800 : 0.9 % of mails [17:40]
tarek_: Wed,3 Apr 2002 14:58:26 +08000  (CEST) : spams and buggy mails [17:40]
tarek_: but all are in english [17:40]
srichter: that should be easy for a pattern parser as you have only 3 effective patterns [17:42]
srichter: and since the last one could be ambiguous, I would suggest clipping off the (xxx) before sending it to the parser [17:42]
tarek_: that's what i did with DateTimeParser yes [17:43]
srichter: so I think you are in good shape using zope.i18n [17:43]
tarek_: yup :) [17:44]
tarek_: thanks for the help srichter, philiKON, bska|mobile [17:44]
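
The approach agreed on above (clip off the trailing "(xxx)", then try a short list of explicit patterns) might look roughly like the sketch below. It is only an illustration: it uses the standard-library `datetime.strptime` instead of the zope.i18n parser, so the pattern syntax is strptime's, and the names `parse_mail_date` and `PATTERNS` are hypothetical.

```python
import re
from datetime import datetime

# A trailing parenthesised comment such as "(CEST)".
_TRAILING_COMMENT = re.compile(r"\s*\([^()]*\)\s*$")

# Candidate patterns, tried in order. strptime syntax; %a and %b rely on
# English day and month names (the default C locale).
PATTERNS = (
    "%a, %d %b %Y %H:%M:%S %z",  # Wed, 3 Apr 2002 14:58:26 +0800
    "%d %b %Y %H:%M:%S %z",      # 3 Apr 2002 14:58:26 +0800
)


def parse_mail_date(text):
    """Parse a mail date by trying each known pattern; never guess."""
    # Clip off the trailing "(xxx)" before handing the string to the parser.
    cleaned = _TRAILING_COMMENT.sub("", text.strip())
    # Normalise spacing so "Wed,3" and "Tue,  5" both become "Xxx, N".
    cleaned = re.sub(r"\s*,\s*", ", ", cleaned)
    cleaned = re.sub(r"\s+", " ", cleaned)
    for pattern in PATTERNS:
        try:
            return datetime.strptime(cleaned, pattern)
        except ValueError:
            continue
    raise ValueError("no known pattern matches %r" % text)


if __name__ == "__main__":
    print(parse_mail_date("Tue,  5 Apr 2005 11:33:39 +0200 (CEST)"))
```

Anything outside the listed patterns raises `ValueError` instead of being guessed at, which keeps the result deterministic and also rejects malformed dates.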
regebro: A note: I think the fact that parsing incorrect date-formats fails is good. It sorts out a lot of spam. [17:46]
regebro: And on a connected topic: Any *sorting* should be done on the date the server *received* the mail, as you otherwise lose mails from people who have the wrong date set on their computer. [17:48]
tarek_: regebro: yes indeed, they all get 01/01/1970 and that's a good sorting [17:48]
regebro: But, you still need to display the send-date, of course. [17:48]
tarek_: but for instance regebro, all our mailing lists are adding the (CEST) stuff [17:49]
tarek_: i mean [17:49]
tarek_: all our CVS things [17:49]
regebro: Hmmm. I don't remember the specs, but doesn't that break them? [17:50]
tarek_: yes it does [17:51]
*** tvon has joined #zope3-dev [17:52]
tarek_: but thunderbird eats them so i have to :) [17:52]
regebro: Well, that should be an easy fix of the CVS mails in that case: Don't add the date header. Most sendmails and qmails will add them (correctly) if they are missing. [17:53]
regebro: good point [17:53]
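
For comparison with the Date: header shapes quoted earlier, current Python versions ship `email.utils.parsedate_to_datetime` in the standard library. The sketch below only shows what it returns for two of those sample strings; it is not the zope.i18n approach, and the wrapper name `header_date_iso` is made up for illustration.

```python
from email.utils import parsedate_to_datetime


def header_date_iso(value):
    """Return a Date: header value as an ISO 8601 string."""
    return parsedate_to_datetime(value).isoformat()


if __name__ == "__main__":
    for sample in (
        "Tue,  5 Apr 2005 11:33:39 +0200 (CEST)",
        "Wed,3 Apr 2002 14:58:26 +0800",
    ):
        print(sample, "->", header_date_iso(sample))
```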
bska|mobile: srichter: looking at zope.i18n.locales data/en.xml it shows the pattern for 'long' as h:mm:ss a z [18:46]
bska|mobile: I thought the 'a' would represent am/pm [18:47]
bska|mobile: but it doesn't seem to [18:47]
srichter: that would surprise me [18:47]
bska|mobile: that is what it's supposed to do? [18:48]
srichter: I am looking at the docs [18:48]
bska|mobile: 08:00:00 +000 is what it renders [18:48]
srichter: a should be the am/pm marker based on the interface [18:49]
srichter: oh, the am/pm marker of the locale you chose might be empty? [18:49]
bska|mobile: hrm, maybe [18:50]
bska|mobile: it's the request locale from my browser, I'll check [18:50]
srichter: mmh, it should have it [18:50]
srichter: can you reproduce the problem in a test? [18:51]
bska|mobile: I'll try, let you know when I do [18:51]
srichter: the code clearly has: [18:52]
srichter:     # am/pm marker (Text) [18:52]
srichter:     for entry in _findFormattingCharacterInPattern('a', pattern): [18:52]
srichter:         info[entry] = ampm [18:52]
srichter: which means that any pattern part containing any amount of 'a' should be the ampm marker [18:53]
srichter: btw, I also have plenty of tests with the am/pm marker: testFormatSimpleHourRepresentation [18:55]
*** hazmat has joined #zope3-dev [19:50]
srichter: AJC: I think I cover this in my book [22:29]
srichter: but basically it is [22:29]
srichter: dc = IDublinCore(obj) [22:29]
AJC: ah, cool.  what's the title of your book? :-) [22:36]
srichter: Zope 3 Developer's Handbook [22:40]
AJC: nice review ;) [22:42]
srichter: yeah, garrett was very nice :-) [22:42]
