All times shown according to UTC.
| Time | Nick | Message | 
|---|---|---|
| 11:00 | pierrick_ | I don't know exactly what hdl's problem was, but I'm working on utf-8 handling for 3.0. I already encountered a problem with 2.2: when I search '%ère%' (like in 'ministère'), it also returns things like 'Simonne CAILLERE' |
| 11:01 | paul | that's not a bug, that's a feature ! | 
| 11:01 | pierrick_ | this is quite smart from MySQL, but this is not what I'm waiting for |
| 11:01 | kados | hehe | 
| 11:01 | paul | thus, you need a MySQL collation other than utf8_unicode_ci |
| 11:01 | hdl | but this is also what librarians wait for. |
| 11:01 | paul | something like unicode_bin should be your friend ! | 
| 11:01 | kados | pierrick_: with zebra, you have quite a bit of control over such behavior | 
| 11:02 | paul | (although I agree with hdl I don't understand why you consider this as wrong) | 
| 11:02 | kados | pierrick_: http://indexdata.dk/zebra/doc/data-model.tkl | 
| 11:02 | paul | to all : I think we should clearly separate the UTF8 problem with MySQL from the UTF8 problem with Zebra. I think it's a MySQL one, isn't it ? |
| 11:02 | pierrick_ | paul: I've read the MySQL documentation about the mysql connection more than twice; maybe I missed something, gonna test one more time |
| 11:02 | hdl | some of them are quite old and always type data in capital letters. |
| 11:03 | paul | pierrick : maybe not, as unicode is a hard feature, for mySQL, as well as for Perl ! | 
| 11:03 | kados | paul: on my rel_2_2 box I have no utf-8 problems | 
| 11:03 | hdl | My pb is to clearly sort out which pb is MySQL's and which is Zebra's. |
| 11:03 | pierrick_ | hdl: yes, google also transforms "é" to "e" | 
| 11:04 | hdl | With accented letters, I encounter problems. |
| 11:04 | Would it be with biblio or with borrowers. | |
| 11:05 | pierrick_ | paul: "select author from biblio where author like '%ère%'" should not return 'Simone CAILLERE' |
| 11:05 | if Koha makes a transformation, that's OK, but MySQL should see a difference | |
| 11:06 | hdl | Is that not depending on collation or such ? | 
| 11:06 | pierrick_ | hdl: I'm using default collation with latin1 character set, this is latin1_swedish_ci | 
| 11:07 | maybe I should test another collation, you're right | |
| 11:07 | paul | _ci means "Case Insensitive". | 
| 11:07 | change it to "Case sensitive" and you should have a different result | |
| 11:08 | hdl | When playing with phpmyadmin I have NO display problems though. | 
| 11:08 | This is quite embarrassing for me. | |
| 11:08 | paul | that's why I was thinking it was a perl problem. | 
| 11:09 | and thought the .utf8 in env was a good solution. | |
| 11:09 | hdl | either PERL or MySQL or perl with Mysql. | 
| 11:09 | pierrick_ | with latin1_general_cs, through phpmyadmin, it still returns Simone | 
| 11:10 | hdl | Mysql may return latin1 whatever collation you set it to.... I read something about that yesterday. |
| 11:10 | pierrick_ | before installing phpmyadmin, I was using a perl script, with a "set names 'latin1'" before any query | 
| 11:11 | hdl | http://lists.mysql.com/perl/3779 | 
| 11:11 | pierrick_ | hdl: warning, there is character set for server, database, table, column and for the connection | 
| 11:11 | hdl | Read this post. | 
| 11:12 | About DBD::Mysql and utf8 | |
| 11:13 | pierrick_ | OK | 
| 11:14 | hdl | or try to googleize : DBD::mysql utf8 support | 
| 11:14 | kados | hdl: do you have some MARC records with accented chars you can send me (the data you are having trouble with)? | 
| 11:14 | I will attempt to get it working on my test box | |
| 11:14 | paul | just copy paste this : | 
| 11:15 | éàùÏ | |
| 11:15 | mmm... no. | |
| 11:15 | hdl | xml or iso2709 ? | 
| 11:15 | paul | they are not "true" utf8. | 
| 11:15 | NONE : | |
| 11:15 | kados | hmmm | 
| 11:15 | I would need iso2709 | |
| 11:15 | hdl | this channel is iso8859-1. | 
| 11:15 | paul | the problem we are speaking of is a MySQL one for the moment. |
| 11:16 | the zebra one is another thing. | |
| 11:16 | kados | right ... so it's branch names, borrowers, etc. | 
| 11:16 | paul | yep. | 
| 11:16 | hdl | But it is linked. | 
| 11:16 | kados | in that case, I can just copy/paste from websites | 
| 11:16 | paul | yep | 
| 11:16 | kados | hdl: I'm ok | 
| 11:16 | hdl: no need to send data | |
| 11:16 | first I must repair my HEAD box :-) | |
| 11:17 | hdl | Since, when launching a rebuild_zebra.pl, you use MySQL data. | 
| 11:17 | So if there is a perl/MySQL problem, when importing to zebra, you will import problems ;) | |
| 11:19 | kados | hdl: http://kohatest.liblime.com/cg[…]admin/branches.pl | 
| 11:20 | hdl: this is the problem you're having? | |
| 11:20 | hdl | no trespassing :/ |
| 11:21 | login/pass ? | |
| 11:21 | kados | kohaadmin | 
| 11:21 | Bo0k52R3aD | |
| 11:22 | hdl | Yes. | 
| 11:22 | kados | hdl: and your locale is set to utf-8? | 
| 11:22 | hdl | yes. | 
| 11:23 | My locale, my AddDefaultCharset in Apache, my keyboard. | |
| 11:23 | kados | hdl: what do you see when you type: locale ? | 
| 11:23 | I see mainly "en_US" | |
| 11:23 | on the kohatest machine | |
| 11:24 | hdl | kados ::http://pastebin.com/592599 | 
| 11:25 | paul | mmm... strange. my /etc/sysconfig/i18n contains : | 
| 11:25 | SYSFONTACM=iso15 | |
| 11:25 | LANGUAGE=fr_FR.UTF8:fr | |
| 11:25 | LC_ADDRESS=fr_FR.UTF8 | |
| 11:25 | LC_COLLATE=fr_FR.UTF8 | |
| 11:25 | LC_NAME=fr_FR.UTF8 | |
| 11:25 | LC_NUMERIC=fr_FR.UTF8 | |
| 11:25 | LC_MEASUREMENT=fr_FR.UTF8 | |
| 11:25 | LC_TIME=fr_FR.UTF8 | |
| 11:25 | LANG=fr_FR.UTF8 | |
| 11:25 | LC_IDENTIFICATION=fr_FR.UTF8 | |
| 11:25 | LC_MESSAGES=fr_FR.UTF8 | |
| 11:25 | LC_CTYPE=fr_FR.UTF8 | |
| 11:25 | LC_TELEPHONE=fr_FR.UTF8 | |
| 11:25 | LC_MONETARY=fr_FR.UTF8 | |
| 11:25 | LC_PAPER=fr_FR.UTF8 | |
| 11:25 | SYSFONT=lat0-16 | |
| 11:25 | BUT : | |
| 11:25 | a set gives me ! | |
| 11:26 | LANG=fr_FR | |
| 11:26 | LANGUAGE=fr_FR:fr | |
| 11:26 | LC_ADDRESS=fr_FR | |
| 11:26 | LC_COLLATE=fr_FR | |
| 11:26 | LC_CTYPE=fr_FR | |
| 11:26 | LC_IDENTIFICATION=fr_FR | |
| 11:26 | LC_MEASUREMENT=fr_FR | |
| 11:26 | LC_MESSAGES=fr_FR | |
| 11:26 | ... | |
| 11:26 | what am I doing wrong ? | |
| 11:26 | s/set/locale/ | |
| 11:28 | hdl | Did you restart your computer after your sysconfig modification ? | 
| 11:28 | paul | yep | 
| 11:29 | hdl | paul : i18n should be fr_FR.UTF-8 You certainly missed the hyphen (-) | 
| 11:30 | you are used to mysql and perl ;) | |
| 11:36 | paul | oups. no | 
| 11:37 | when I log as "paul" i'm not. | |
| 11:37 | when I su - i am. | |
| 11:37 | really strange... | |
| 11:37 | any idea someone ? | |
| 11:37 | kados | paul: check your .bash* files | 
| 11:38 | could be that your charset is specified there? | |
| 11:39 | paul: http://dev.mysql.com/doc/refma[…]n.html?ff=nopfpls | |
| 11:39 | paul | how could I see ? | 
| 11:39 | kados | is that what you've done as of now? | 
| 11:39 | paul: in your /home/paul dir | |
| 11:39 | there are several .bash* files | |
| 11:39 | paul | yes I know, but in .bashrc I don't see anything | 
| 11:40 | kados | in .bash_profile? | 
| 11:40 | paul | # Get the aliases and functions | 
| 11:40 | if [ -f ~/.bashrc ]; then | |
| 11:40 | . ~/.bashrc | |
| 11:40 | fi | |
| 11:40 | PATH=$PATH:$HOME/bin | |
| 11:40 | BASH_ENV=$HOME/.bashrc | |
| 11:40 | USERNAME="" | |
| 11:40 | export USERNAME BASH_ENV PATH | |
| 11:40 | xmodmap -e 'keycode 0x5B = comma' | |
| 11:40 | and that's all | |
| 11:40 | bashrc : | |
| 11:40 | alias rm='rm -i' | |
| 11:40 | alias mv='mv -i' | |
| 11:40 | kados | I'm not sure then :/ | 
| 11:40 | paul | alias cp='cp -i' | 
| 11:40 | [ -n $DISPLAY ] && { | |
| 11:40 | . /etc/profile.d/alias.sh | |
| 11:40 | } | |
| 11:41 | [ -z $INPUTRC ] && export INPUTRC=/etc/inputrc | |
| 11:41 | set $PATH=$PATH:/usr/local/kde/bin | |
| 11:41 | export PATH | |
| 11:41 | if [ -f /etc/bashrc ]; then | |
| 11:41 | . /etc/bashrc | |
| 11:41 | fi | |
| 11:41 | hdl | you say I am not... But is it your keyboard or your locale that is not? |
| 11:41 | paul | my locale. | 
| 11:41 | locale told me fr_FR | |
| 11:41 | except after a su - that says fr_FR.UTF-8 | |
| 11:42 | kados | strange indeed | 
| 11:42 | hdl | Have you loaded a keyboard or a system with your MCC ? |
| 11:43 | or in KDE control ? | |
| 11:43 | paul | how can I check ? | 
| 11:43 | (you already told me but I don't remember) | |
| 11:44 | hdl | (If I told you, I don't remember ;) ) | 
| 11:44 | Configure your computer and go to the keyboard section. | |
| 11:46 | Test in console mode Ctrl Alt F2 and check if locale is the same. | |
| 11:47 | (Sh.....!!!) kados: I can't search my base again. | |
| 11:47 | kados | hehe | 
| 11:47 | hdl | (I rebuilt the stuff, once again today). |
| 11:48 | paul | "Configure your computer and go to the keyboard section" ==>>> I don't see what you mean |
| 11:53 | hdl | paul: under hardware / keyboard layout. |
| 11:53 | There are also the KDE accessibility settings. | |
| 11:54 | paul | keyboard layout => you mean keyboard configuration? |
| 11:54 | and what do I choose? | |
| 11:58 | thd | kados: I know what happened for some issues of multiple authorities created where there should have only been one. | 
| 11:58 | kados | thd: you found such instances? | 
| 11:58 | hdl | I haven't seen anything very conclusive. |
| 11:58 | thd | kados: look at William Faulkner. | 
| 11:59 | hdl | I had to do it myself, by hand, in the file /etc/sysconfig/keyboard |
| 11:59 | Normally, French. | |
| 12:00 | thd | kados: Only half of those records used authority control for 100. | 
| 12:01 | kados: Some differences are because of the presence or absence of a full stop at the end of the field. | |
| 12:02 | kados | thd: I'm currently working on some utf-8 probs in 3.0 | 
| 12:02 | thd | kados: The values are not being normalised before the comparison is made. | 
| 12:02 | kados | thd: I hope to have some more time to work on authorities this afternoon | 
| 12:03 | thd | kados: What I do not know is why bib records show as 0 | 
| 12:03 | kados | thd: in the meantime, if you're finished with the MARC framework, you could start compiling a list of problems with the import :-) | 
| 12:03 | thd: so we can refine it :-) | |
| 12:03 | thd: I suspect that's a template prob ... I noticed it as well | |
| 12:04 | thd | kados: by import you mean authority building? | 
| 12:04 | kados | yep | 
| 12:06 | pierrick_ | As I suspected, in Koha 2.2 borrowers search, searching "anaë" returns "anaël" and "anaes". It might be smart, but it's wrong... |
| 12:08 | kados: searching an accented name returns unaccented names | |
| 12:09 | kados | and I suspect that's mysql being clever :-) | 
| 12:09 | you can probably turn off this feature if you don't like it | |
| 12:09 | you'd have to check the manual though | |
| 12:09 | paul | pierrick_ : who told you it was false ? | 
| 12:10 | because from a librarian point of view it's exactly what they want ! | |
| 12:10 | otherwise, a forgotten or wrong accent would make many data disappear ! | |
| 12:10 | kados | so here is what I have learned so far about utf-8 and perl | 
| 12:11 | paul | wait a little | 
| 12:11 | kados | earlier versions of perl (before 5.6) did not distinguish between a byte and a character | 
| 12:11 | paul: ok | |
| 12:12 | hdl | kados : can I try and make a summary of operations needed to get a zebra base. | 
| 12:12 | kados | please do | 
| 12:12 | hdl | 1) modify zebracfg according to your US one. | 
| 12:13 | And create tmp, shadow, lock directories. | |
| 12:14 | 2) then zebraidx create Nameofyourzebrabase (defined in /etc/koha.conf) | |
| 12:14 | zebraidx commit | |
| 12:14 | paul: working better? | |
| 12:14 | paul | still not utf-8 when logged as paul | 
| 12:14 | :-( | |
| 12:15 | hdl | 3) zebrasrv localhost:2100/nameofyourbase | 
| 12:15 | pierrick_ | paul: it's wrong from a general point of view because it becomes impossible to search for anaël and not anaes |
| 12:16 | but it might be a MySQL cleverness. I thought it was a Koha feature | |
| 12:16 | paul | yes, but WHO wants to search anaës and not anaes ? |
| 12:16 | (in real life I mean) | |
| 12:17 | kados | hdl: looks correct | 
| 12:17 | hdl | 4) launch rebuild_zebra.pl -c (on an updated base) wait and wait and wait.... | 
| 12:17 | pierrick_ | we are talking about a stupidly simple example, imagine something like a chinese character^W ideogram search |
| 12:17 | kados | pierrick_: the problem is, what if I want to search for that but don't have the correct keyboard? | 
| 12:17 | hdl | 5) zebraidx commit | 
| 12:17 | (commit all the stuff) | |
| 12:17 | kados | hdl: zebraidx commit should not be necessary | 
| 12:18 | pierrick_ | kados: this is the reason why many applications offer a table of characters |
| 12:18 | kados | hdl: unless rebuild_zebra.pl does not use the correct subroutine in Biblio.pm to connect to z3950 extended services | 
| 12:18 | hdl | kados: Yet it seems to be. | 
| 12:18 | kados | hdl: it should use the z3950_extended_services() routine in the same way that bulkmarcimport.pl does | 
| 12:18 | hdl | unless you commited a fix very recently. | 
| 12:19 | pierrick_ | (but my "problem" is really not important at all, I admit) | 
| 12:19 | kados | hdl: i haven't had time to ... | 
| 12:19 | hdl | shadow gets full and no .mf files... |
| 12:20 | kados | maybe 4 gig isn't big enough? | 
| 12:20 | paul | OK, I have to stop working on utf-8 atm | 
| 12:20 | kados | paul: shall I explain what I have learned about utf-8? | 
| 12:20 | paul | for sure ! | 
| 12:20 | (i'm still here, but answering an RFP) | |
| 12:20 | kados | ok | 
| 12:20 | earlier versions of perl (before 5.6) did not distinguish between a byte and a character | |
| 12:21 | which is a major problem for unicode of course | |
| 12:21 | in perl 5.6 they wanted to: | |
| 12:21 | 1. not break old byte-based programs | |
| 12:22 | when they were using byte-based characters | |
| 12:22 | 2. allow byte-based programs to use character-based characters 'magically' | |
| 12:23 | to do this, perl uses bytes by default (at least in 5.6) | |
| 12:23 | to use character-based you must 'mark' the character-based interfaces so that perl knows to expect character-oriented data | |
| 12:24 | when they have been so marked, perl will convert all byte-based characters to utf-8 | |
| 12:26 | the bottom line is, we must explicitly tell perl we are working with utf-8 | |
| 12:26 | Sylvain | Hi all | 
| 12:27 | paul | another frenchy ! | 
| 12:27 | kados : you're right. | |
| 12:27 | kados | Input and Output Layers | 
| 12:27 | Perl knows when a filehandle uses Perl's internal Unicode encodings (UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with the ":utf8" | |
| 12:27 | layer. Other encodings can be converted to Perl's encoding on input or from Perl's encoding on output by use of the ":encoding(...)" layer. See | |
| 12:27 | open. | |
| 12:27 | paul | and Encode::decode does this for variables. |
| 12:27 | kados | yep | 
| 12:27 | I see no way around this | |
| 12:27 | paul | and DBD::mysql returns "non-flagged" results. |
| 12:28 | kados | so we must again set the flag | 
| 12:28 | paul | as everything is OK for Tümer, I was suspecting he had something that could explain this. |
| 12:28 | it can be the utf8 config of its server (+ he's under windows) | |
| 12:28 | kados | are you sure he is testing with HEAD? | 
| 12:28 | paul | no. | 
| 12:28 | kados | because I could also say 'everything is OK' on my server | 
| 12:29 | paul | in fact I think he's working with 2.2 but I could be wrong. | 
| 12:29 | pierrick_ and Sylvain : introduce yourself | |
| 12:29 | kados | in that case, I suspect that as soon as he re-encodes mysql as utf-8 as in HEAD he will have the same problems we have | 
| 12:29 | paul | :-( | 
| 12:29 | could you ask him ? | |
| 12:29 | kados | sure | 
| 12:30 | paul | I had a solution that seemed to work : |
| 12:30 | I installed mysqlPP driver and hacked it a little. | |
| 12:30 | it's a Pure Perl mysql driver. | |
| 12:30 | it worked it seems. I just Encode everything coming from mysql socket | |
| 12:30 | kados | interesting | 
| 12:31 | paul: you will hate me for saying this: what about switching to postgres? :-) | |
| 12:31 | paul | no answer from mysql ? | 
| 12:31 | kados | no answer from mysql | 
| 12:31 | paul | I won't | 
| 12:31 | I just will say : why not, but that's a huge task ! | |
| 12:31 | kados | does the DBD::Pg driver have the same problems? |
| 12:31 | paul | (+ complex for existing libraries) | 
| 12:31 | no. | |
| 12:32 | in fact the fix for mysql you can find on the net is a port from the fix for Postgres ! | |
| 12:32 | kados | paul: how many hours do you estimate it would take? | 
| 12:32 | (I have been working with postgres with Evergreen and I must say it is much nicer than mysql) | |
| 12:34 | (though much harder to use) | |
| 12:35 | paul: would switching to postgres be harder than putting in 'Encode' everywhere? | |
| 12:35 | paul: in your opinion? | |
| 12:35 | paul | yes because adding Encode is a boring but trivial task | 
| 12:35 | whereas switching to Postgres will make some problems with DB structure & management | |
| 12:36 | kados | ahh | 
| 12:36 | pierrick_: you have postgres experience, right? | |
| 12:36 | pierrick_: what is your opinion? | |
| 12:36 | morning owen | |
| 12:36 | owen | Hi | 
| 12:36 | kados | owen: the patron images stuff looks nice :-) | 
| 12:37 | hdl | Under Windows, Mysql and apache may be utf8 by default. | 
| 12:37 | kados | in case noone has seen it: | 
| 12:37 | http://koha.liblime.com/cgi-bi[…]dborrower=0054313 | |
| 12:37 | owen has created a very nice patronimages option in the NPL templates | |
| 12:38 | (of course, my pic is not available :-)) | |
| 12:38 | paul | aren't you here joshua ? | 
| 12:38 | http://www.paulpoulain.com/pho[…]img_0061.jpg.html | |
| 12:38 | ;-) | |
| 12:39 | lol | |
| 12:39 | kados | hehe ... yes ... with long hair even :-) | 
| 12:39 | paul | (still long ?) | 
| 12:39 | kados | owen: hehe | 
| 12:39 | paul: no ... quite bald now :-) | |
| 12:39 | paul | bald ? | 
| 12:39 | hdl | Some information is not well displayed (on the right of the picture) |
| 12:39 | kados | paul: I shaved my head with a razor about a week ago :-) |
| 12:39 | paul | (same for me -konqueror-) | 
| 12:39 | wow ! | |
| 12:40 | kados | paul: but the long hair was cut some months ago ... about 6 months in fact | 
| 12:40 | paul: right before LibLime's first conference :-) | |
| 12:40 | paul | you're like a business man then now ? | 
| 12:40 | pierrick_ | (I'm back... sorry kados, you asked a question, I'm going to answer) | 
| 12:40 | kados | almost :-) | 
| 12:42 | hdl | When I tried to install Koha on a Windows box, data had to be utf-8. |
| 12:42 | c< | |
| 12:43 | owen | kados: I think your liblime color stylesheet is missing some CSS relating to the patron image. That might be why it's getting overlaid by the patron details | 
| 12:45 | pierrick_ | So, my opinion about PostgreSQL ? | 
| 12:46 | If Koha uses MySQL InnoDB as table engine and utf8 as charset, I would say that it's worth switching to PostgreSQL | |
| 12:47 | my PostgreSQL experience is quite old in fact, I was working on it in 2002 on a Java CMS | |
| 12:48 | my internship was about making the CMS talk to MySQL or Oracle or PostgreSQL. In unicode because the customer was Asian | |
| 12:49 | and if I remember well, it was quite easy in fact | |
| 12:55 | but I can't believe we can make Koha work in full UTF-8 using the same technologies (Perl and MySQL) as in 2.2 | |
| 12:55 | kados | right | 
| 12:55 | the more I think about it the more I like the idea | |
| 12:56 | pierrick_ | sorry, I wanted to say "we can't make" | 
| 12:56 | kados | I think we need to proceed carefully though | 
| 12:56 | pierrick_: (i understood) | |
| 12:57 | pierrick_ | my front co-worker tells me PostgreSQL in UTF8 is not working very well under Windows |
| 12:58 | kados | interesting | 
| 12:59 | this is a true dilemma then :-) | |
| 12:59 | paul | the last possibility being to stay with our 2x1 000 000 problem. | 
| 12:59 | pierrick_ | because PostgreSQL charset is based on system locale... and under Windows, you only have foo1252 | 
| 12:59 | paul | * keep mysql collate NOT in utf8 | 
| 12:59 | as in 2.2 ! | |
| 13:00 | to test : | |
| 13:00 | kados | but I think eventually we will need to fix the underlying problem | 
| 13:00 | paul | * get a 2.2 working | 
| 13:00 | * comment the utf8 move in updatedatabase | |
| 13:00 | * updatedatabase & see if it works | |
| 13:00 | kados | right | 
| 13:01 | pierrick_ | and change the HTML headers... | 
| 13:01 | paul | (that's already done in head pierrick_) | 
| 13:01 | (in PROG templates) | |
| 13:03 | kados | brb | 
| 13:03 | pierrick_ | paul: in PROG template, I read charset=utf-8 hardcoded | 
| 13:03 | paul | yep. | 
| 13:03 | that's what we want. | |
| 13:04 | it seems that utf8 works better if : | |
| 13:04 | * we keep mysql in iso | |
| 13:04 | pierrick_ | if your data are not stored in UTF-8, you'll have display problems | 
| 13:04 | paul | * we do nothing in Perl |
| 13:04 | that's what I call the 2x1 million $ problem : | |
| 13:04 | pierrick_ | when do you convert from iso to utf8 for display ? | 
| 13:04 | paul | the result is 0, as expected. | 
| 13:04 | but hides 2 problems. | |
| 13:05 | I don't know. I just see that it works under 2.2 and for Tümer in Turkey ! | |
| 13:05 | it's dangerous to get something working through 2 things not working, but that's the only solution I see atm | |
| 13:06 | pierrick_ | "ça tombe en marche" [roughly: "it stumbles into working"] (sorry kados, don't know how to translate) |
| 13:06 | I hate not understanding :-/ | |
| 13:06 | paul | me too. that's why I tried to understand. | 
| 13:07 | exactly : "ça tombe en marche" | |
| 13:07 | pierrick_ | is there a mail on the mailing list explaining clearly the initial problem ? | 
| 13:08 | paul | no, there is a collection of mails. | 
| 13:08 | pierrick_ | I thought "set names 'UTF8';" was the solution |
| 13:08 | (once database correctly converted to utf8) | |
| 13:09 | paul | I thought too. | 
| 13:09 | that's why i added it to Auth.pm | |
| 13:11 | pierrick_ | Auth.pm means authorities ? Why not in Context.pm where the database connection is made ? | 
| 13:12 | paul | you're right. | 
| 13:13 | sorry | |
| 13:13 | (it was to see if you were following us ;-) ) | |
| 13:13 | pierrick_ | hehe | 
| 13:15 | from where do I re-read IRC and koha-devel to summarize the "MySQL, Perl and UTF-8 issue", I will summarize it on koha-devel | |
| 13:20 | kados | pierrick_: list archives are on savannah ... but google will find them better for you | 
| 13:20 | koha.org/irc is the irc log | |
| 13:21 | pierrick_ | s{from where}{since when}g | 
| 13:24 | hdl | pierrick_: (I sent them to you via email) |
| 13:24 | (filter on utf in devel list.) | |
| 13:26 | pierrick_ | thank you hdl :-) | 
| 14:01 | (just to say in the wind : how easy it was to convert and use UTF8 with Java/Oracle, that's what we did in my previous job... but making C talk to Oracle UTF8 was hard... and I hate Oracle anyway) |
| 14:08 | kados | paul: are you still here? | 
| 14:08 | paul: do your clients use the 'issuingrules' aspect of Koha? | |
| 14:08 | paul | (on phone) | 
| 14:08 | kados | paul: it's been broken for several versions now | 
| 14:10 | hdl | Not for us. | 
| 14:10 | kados | The biggest bug is that filling in values in the * column doesn't work: it should set default values for all patron types, but it doesn't. |
| 14:11 | also, if values are left out, there are no hardcoded defaults | |
| 14:11 | issuing will just fail | |
| 14:12 | hdl: do your clients not experience this behavior? | |
| 14:12 | Sylvain | kados I've got problems with issuing rules and empty cells | 
| 14:12 | hdl | They fill in all the cells ;) |
| 14:12 | Sylvain | generating null values in issuingrules table | 
| 14:12 | hdl | except fees. | 
| 14:13 | kados | so in my view, something is broken if it doesn't work the way it says it works :-) | 
| 14:13 | Sylvain | I agree that it doesn't work :) |
| 14:13 | kados | and issuingrules have been broken for several versions :-) | 
| 14:13 | Sylvain | I think I had done a patch, have to search |
| 14:13 | kados | it is seemingly small problems like this that give us a bad name | 
| 14:13 | it makes Koha appear buggy | |
| 14:14 | Sylvain: that'd be great! | |
| 14:15 | hdl | Yes, but the default value could also be a syspref, so that ppl could set it once, and 21,5 is only an example. |
| 14:15 | kados | hdl: is there a syspref for this? | 
| 14:16 | hdl | Not yet. But If Sylvain sends his patch, I could get it worl. | 
| 14:16 | s/worl/work | |
| 14:16 | And of course, if it is needed. | |
| 14:18 | Sylvain | in admin/issuingrules.pl there's a line with a # which is if ($maxissueqty > 0) | 
| 14:18 | I've replaced it by : | |
| 14:18 | if (($issuelength ne '') and ($maxissueqty ne '')) | |
| 14:18 | { | |
| 14:18 | and for me it works | |
| 14:19 | kados | Sylvain: I'll try this | 
| 14:20 | Sylvain: what line? | |
| 14:20 | ahh ... nevermind | |
| 14:20 | Sylvain | but it's not tested a lot :) | 
| 14:20 | kados | Sylvain: I see two instances of this | 
| 14:21 | Sylvain: did you replace both? | |
| 14:21 | Sylvain | line 69 only | 
| 14:22 | but maybe the second one creates another problem | |
| 14:22 | as far as I remember this change removed the pb with null values | |
| 14:26 | paul_away | see you on monday, for a new week of Koha hack ! | 
| 14:26 | pierrick_ | have a good long WE paul :-) | 
| 14:27 | paul_away | (i'm with a customer tomorrow. not WE !) | 
| 14:27 | (Ouest Provence to say everything...) | |
| 14:28 | pierrick_ | oh OK, enjoy your 50 km trip :-) |
| 14:28 | kados | bye paul_away | 
| 14:28 | pierrick_ | kados: should I say "journey" or "trip"? | 
| 14:29 | kados | trip I think | 
| 14:29 | journey is a bit archaic | |
| 14:29 | pierrick_ | thanks | 
| 14:37 | hdl: I've finished reading the mails you forwarded to me and associated web links | |
| 14:37 | Paul has already done a deep investigation | |
| 14:38 | I have to read IRC log in details to understand what remains problematic, but I'll do it tomorrow morning | |
| 14:38 | I'm going back home now, diner outside | |
| 15:05 | thd | owen: why does the stylesheet have no file name in rel_2_2 ? |
| 15:05 | owen | I'm not sure what you mean | 
| 15:06 | thd | <link rel="stylesheet" type="text/css" href="/opac-tmpl/npl/en/includes/" /> | 
| 15:06 | <style type="text/css"> | |
| 15:07 | @import url(/opac-tmpl/npl/en/includes/); | |
| 15:07 | </style> | |
| 15:07 | owen | There are new system preferences for defining those stylesheets | 
| 15:07 | thd | owen: oh, yes I looked for that but did not see them | 
| 15:07 | owen | We need to handle this better somehow in the case of new installations, I think | 
| 15:08 | thd | owen: what are the preferences called? | 
| 15:08 | owen | for the default, use opac.css for opaclayoutstylesheet | 
| 15:08 | and colors.css for opaccolorstylesheet | |
| 15:08 | thd | I see them | 
| 15:09 | owen: I had a mistaken pathname in my update script recently and had been missing changes which I have only just seen today | |
| 15:10 | owen: I assumed the changes were there but not working as I had expected. | |
| 15:10 | thank you owen | |
| 16:32 | owen | kados: you around? | 
| 16:39 | kados | owen: yea ... kinda | 
| 16:39 | what's up? | |
| 16:40 | owen | http://koha.liblime.com/cgi-bi[…]uest.pl?bib=18398 | 
| 16:40 | Is it even possible with Koha now to have more than one checkbox in that list of items? | |
| 16:40 | Is it still possible to have more than one itemtype attached to one biblio? | |
| 16:41 | kados | this is the tricky bit | 
| 16:41 | yes, strictly speaking, you can have more than one itemtype attached to a biblio | |
| 16:42 | however, this behavior isn't supported in the MARC21 version of Koha | |
| 16:42 | it is in the non-MARC and the UNIMARC | |
| 16:42 | I'm not sure why we got short-changed | |
| 16:42 | it's one of those things on my list to check out | |
| 16:42 | so ... thanks for reminding me :-) | |
| 16:43 | owen | I'm just trying to figure out whether we still need the option to choose an item type when making a reserve (it's one of the things hidden in NPL's production template). So it's just us that can't use it. | 
| 16:43 | thd | kados: that is not supported on any MARC version of Koha | 
| 16:45 | owen: there is a workaround that requires a lot of work to setup but unfortunately I do not have time to explain it to you at the moment | |
| 16:45 | owen | No problem. I'm just trying to clean up the templates where possible (answer: not here) | 
| 16:45 | kados | it's a major flaw in the current design | 
| 16:46 | I'd leave it in for now ... hopefully we can fix it | |
| 16:46 | owen | Did you create a new syspref to show/hide the reading record? | 
| 16:47 | Oh... opacreadinghistory. | |
| 16:47 | Can that be used in the intranet too? | |
| 16:54 | I guess so | |
| 16:55 | kados | yep it can ... needs support in the templates, that's all | 
| 16:55 | but might make more sense to have a separate one for the intranet | |
| 16:55 | owen | So I can hide the link to the reading record page. Should I disable the display of reading history within the reading record page itself? | 
| 16:56 | i.e. "this page has been disabled by your administrator" | |
| 16:56 | kados | yea | 
| 17:30 | thd | kados: are you there? | 
| 17:30 | kados | thd: kinda | 
| 17:45 | thd | kados: now that I identified that problem I know there will be a need for a routine to translate the illegal ISO-8859 records people have into UTF-8 | 
| 17:46 | kados: That will be a little tricky because the leader will always claim the encoding is in another character set | |
| 17:48 | s/another/a legal/ | |
| 17:49 | kados | I didn't think that anyone would be so foolist | 
| 17:49 | foolish :-) | |
| 17:49 | as to create MARC in iso-8859 :) | |
| 17:51 | thd | kados: In fact all of paul's customers may have that problem | 
| 17:53 | kados: BNF can export records in the illegal ISO-8859-1 character set while the encoding still shows ISO-5426 | |
| 17:53 | kados | strange | 
| 17:53 | if you can't rely on the leader there's no way I can think of to auto-sense what charset you're working with | |
| 17:54 | thd | kados: that is a great help for systems that are UTF-8 challenged | 
| 17:54 | kados: However, it is an additional problem for migration to UTF-8 | |
| 17:56 | kados: The solution is not especially difficult even when the record encoding value does not match the actual encoding | |
| 17:56 | kados: I have seen code that tests for question marks and then is able to try to guess the encoding. | |
| 18:45 | kados: I should not have looked now but the source records were MARC-8 and they have been translated into UTF-8, although nothing changed the leader encoding for 000/09. | |
| 18:46 | kados | interesting | 
| 18:46 | who translated them? | |
| 18:46 | thd | kados: However my problem is probably that Perl refuses to send them to Apache as UTF-8 without changing my locale | 
| 18:46 | kados: Koha translated them | |
| 18:46 | kados | so you're doing original cataloging? | 
| 18:47 | it didn't change the leader to UTF-8? | |
| 18:47 | ahh ... what version of MARC::File::XML are you using? | |
| 18:47 | upgrade to 0.82 and test again | |
| 18:47 | thd | kados: No these are records captured with my test YAZ/PHP Z39.50 client | 
| 18:48 | kados | but you're manually copy/pasting them into the Koha editor? | 
| 18:48 | thd | kados: I can confirm that the contents of the data in MySql is correctly encoded in UTF-8 | 
| 18:49 | kados: No I used bulkmarcimport.pl | |
| 18:50 | kados: Everything should at least look fine except that Perl is telling Apache that I am sending ISO-8859 | |
| 18:51 | kados: Yet PHP does not care what my locale is for sending the data to Apache correctly | |
| 19:01 | kados | thd: you mean apache is telling perl that you're sending iso-8859 | 
| 19:01 | thd: I don't see how you can interact directly with perl on a browser | |
| 19:04 | thd | kados: I am probably not describing it correctly but PHP web applications work fine with UTF-8 on my system; Perl would seem to be the problem. | 
| 19:06 | kados: The problem could be specific to Koha, I will have to test later if I create a UTF-8 page in Perl outside of Koha to see whether that works correctly. | |
| 19:18 | kados: I strongly suspect Perl generally: a design issue prevents it from handling Unicode as flexibly as languages introduced or modified more recently, because Perl originated without any thought for multi-byte character sets. That is one thing Perl 6 is intended to remedy. | |
| 22:35 | kados: Koha is responsible for sending the characters in conformance to my locale encoding using some feature of Perl most likely. This is what I had proposed to develop as part of a configurable page serving feature. MARC::Charset would not be required for that. | |
| 22:43 | kados: the data in the XHTML is ISO-8859 but the data in MySQL is UTF-8. Apache cannot be responsible. Apache is in fact using UTF-8 encoding as directed but the data is ISO-8859. | |
| 23:35 | kados | thd: update your addbiblio.pl | 
| 23:35 | thd: I just committed a fix for your issue (I think) | |
| 23:35 | thd: also, if you're around, could you explain to me the different character sets that UNIMARC uses? | |
| 23:35 | thd | ok updating now | 
| 23:36 | kados: It is many but at least full unicode is defined | |
| 23:36 | kados | but I mean: what character sets (other than utf-8 and MARC-8) are UNIMARC records likely to be downloaded in | 
| 23:37 | or uploaded into the reservoir as | |
| 23:37 | thd | kados: no MARC-8 in UNIMARC | 
| 23:37 | kados | just utf-8 then? | 
| 23:39 | thd | kados: paul's users as I had said have been obtaining records encoded in ISO-8859 that should have been ISO-5426 | 
| 23:40 | kados | that's really tricky | 
| 23:40 | anyway, did the fix work for you? | |
| 23:42 | thd | kados: I interrupted the update to bring you | 
| 23:43 | UNIMARC 100 $a a fixed field defining the character sets in a manner similar to 000/09 in MARC 21 | |
| 23:43 | $a/26-29 Character Sets (Mandatory) | |
| 23:43 | kados | just update the one file | 
| 23:43 | thd | Two two-character codes designating the principal graphic character sets used in communication of the record. Positions 26-27 designate the G0 set and positions 28-29 designate the G1 set. If a G1 set is not needed, positions 28-29 contain blanks. For further explanation of character coding see Appendix J. The following two-character codes are to be used. They will be augmented as required. | 
| 23:43 | 01 = ISO 646, IRV version (basic Latin set) | |
| 23:43 | 02 = ISO Registration # 37 (basic Cyrillic set) | |
| 23:43 | 03 = ISO 5426 (extended Latin set) | |
| 23:43 | 04 = ISO DIS 5427 (extended Cyrillic set) | |
| 23:43 | 05 = ISO 5428 (Greek set) | |
| 23:43 | 06 = ISO 6438 (African coded character set) | |
| 23:43 | 07 = ISO 10586 (Georgian set) | |
| 23:43 | 08 = ISO 8957 (Hebrew set) Table 1 | |
| 23:43 | 09 = ISO 8957 (Hebrew set) Table 2 | |
| 23:43 | 10 = [Reserved] | |
| 23:43 | 11 = ISO 5426-2 (Latin characters used in minor European languages and obsolete typography) | |
| 23:43 | 50 = ISO 10646 Level 3 (Unicode) | |
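The fixed-field positions thd quotes can be parsed mechanically. A hedged sketch (the dictionary is abbreviated to a few of the codes listed above; the function itself is illustrative, not part of any UNIMARC library):

```python
# A few of the UNIMARC 100 $a/26-29 codes quoted above.
UNIMARC_CHARSETS = {
    "01": "ISO 646, IRV version (basic Latin set)",
    "02": "ISO Registration # 37 (basic Cyrillic set)",
    "03": "ISO 5426 (extended Latin set)",
    "50": "ISO 10646 Level 3 (Unicode)",
}

def graphic_sets(field_100a: str):
    # Positions 26-27 carry the G0 set, 28-29 the G1 set;
    # blanks in 28-29 mean no G1 set is declared.
    g0 = field_100a[26:28]
    g1 = field_100a[28:30]
    return (
        UNIMARC_CHARSETS.get(g0, g0),
        None if g1.strip() == "" else UNIMARC_CHARSETS.get(g1, g1),
    )
```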
| 23:44 | kados | cvs update addbiblio.pl | 
| 23:44 | thd | kados: I know but I have to put it in the right place so this is faster | 
| 23:44 | kados | shit ... that's a lot of encodings | 
| 23:44 | I note that 8859's not on that list | |
| 23:45 | thd | kados: French users only have to worry about ASCII and ISO-5426 | 
| 23:45 | kados | well, so Koha's far from supporting UNIMARC | 
| 23:45 | at least in terms of encoding | |
| 23:45 | MARC::Charset only knows how to deal with MARC-8 and UTF-8 | |
| 23:45 | so UNIMARC's in trouble :-) | |
| 23:46 | thd | kados: there is MAB::Encode or whatever it is called for ISO-5426 | 
| 23:46 | kados | thd: not ascii according to the list you posted | 
| 23:46 | thd: ascii _is_ iso-8859 | |
| 23:47 | thd | kados: ASCII is in there as an ISO standard | 
| 23:48 | kados: ISO 8859 has Latin characters past 128 which makes it more than ASCII | |
| 23:48 | kados | thd: well ... I guess ASCII is a subset of 8859 | 
| 23:48 | right | |
| 23:49 | thd | ASCII is ISO-646 | 
| 23:49 | kados | thd: where's MAB::Encode? | 
| 23:49 | thd | kados: CPAN | 
| 23:49 | kados | don't see it | 
| 23:49 | thd | kados: It is described as Alpha but I have never seen actual problem reports | 
| 23:50 | not that I really looked | |
| 23:50 | kados | ahh ... Encode::MAB | 
| 23:51 | thd | http://search.cpan.org/~andk/M[…]ib/Encode/MAB2.pm | 
| 23:52 | kados | thd: so I check character positions 26-27 and 28-29 in UNIMARC, if 26-27 are set to '01' I should use Encode::MAB2? | 
| 23:52 | to fix the encoding? | |
| 23:53 | hmmm ... it can't go to utf-8 | |
| 23:53 | thd | kados: Except that BNF does not change that when sending records in ISO-8859-1 | 
| 23:53 | kados | yowser | 
| 23:53 | thd | kados: They can hardly set it to an undefined value | 
| 23:54 | kados | there's really nothing that can be done about that | 
| 23:54 | thd | kados: yes in fact there may be a check | 
| 23:54 | kados | we can't be expected to detect the record encoding | 
| 23:55 | thd | kados: I posted before about guessing the encoding and then checking to see if an error is produced after temporary parsing | 
| 23:55 | kados | heh | 
| 23:55 | thd | kados: I have seen routines that essentially search for question marks where they should not appear. | 
| 23:56 | kados | hehehe | 
| 23:56 | you're nuts :-) | |
| 23:56 | thd | kados: I assume those methods are not foolproof | 
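The guess-and-check idea thd describes (decode tentatively, look for replacement marks) can be approximated like this; the candidate list, ordering, and function name are assumptions, and as thd says the method is not foolproof:

```python
def guess_encoding(raw: bytes, candidates=("utf-8", "iso-8859-1")) -> str:
    # Decode under each candidate with replacement, then count the
    # U+FFFD markers -- the "question marks where they should not
    # appear". Fewest markers wins; ties keep the earlier candidate.
    best_enc, best_bad = candidates[0], None
    for enc in candidates:
        bad = raw.decode(enc, errors="replace").count("\ufffd")
        if best_bad is None or bad < best_bad:
            best_enc, best_bad = enc, bad
    return best_enc

# Note: Latin-1 never produces U+FFFD (every byte maps to some
# character), so it acts as the fallback when UTF-8 decoding degrades.
```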
| 23:57 | kados | well ... it might be a good customization | 
| 23:57 | for special cases | |
| 23:57 | I certainly wouldn't want to have that be default behavour | |
| 23:59 | thd | kados: I tried one of those methods in PHP just for fun but I was not feeding it good data | 
| 23:59 | kados | thd: did my fix work for you? | 
| 00:00 | thd | kados: not with existing data, I am using bulkmarcimport.pl -d right now | 
| 00:01 | kados: who has been partially normalising the searches? | |
| 00:01 | kados | ahh | 
| 00:01 | normalizing? | |
| 00:02 | so you're saying that rel_2_2 doesn't currently handle imports from bulkmarcimport ? | |
| 00:02 | well ... encoding that is? | |
| 00:02 | could you send me the records you're importing | |
| 00:02 | so i can fix it? | |
| 00:02 | thd: ? | |
| 00:03 | thd | kados: I found before records that I should not have found when searching Cezanne instead of C?zanne | 
| 00:03 | kados | that's mysql being smart | 
| 00:03 | is that your only problem? | |
| 00:04 | thd | kados: no that would be a benefit if I could find records by searching with C?zanne as well | 
| 00:05 | kados | thd: send me the records | 
| 00:05 | thd | kados: the correct diacritics failed even when I supplied the UTF-8 string | 
| 00:05 | kados | thd: I'll fix it :-) | 
| 00:06 | thd | kados: Let me get a better copy of the records they are full of redundancies from multiple targets making it difficult to search them for one | 
| 00:06 | kados | thd: but I don't have much time tonight, so you better make it quick | 
| 00:06 | thd | kados: I will be fast | 
| 00:07 | nothing has been fixed after re importing the records. | |
| 00:08 | kados | thd: I have a fix for you | 
| 00:08 | open bulkmarcimport.pl | |
| 00:08 | thd | kados: ok one moment | 
| 00:10 | open | |
| 00:10 | kados | thd: line 79 | 
| 00:10 | add the following after it: | |
| 00:10 | my $uxml = $record->as_xml; | |
| 00:10 | $record = MARC::Record::new_from_xml($uxml, 'UTF-8'); | |
| 00:10 | try that | |
| 00:11 | thd | while ( my $record = $batch->next() ) { | 
| 00:11 | kados | yep ... right after that line | 
| 00:11 | thd | that is line 79 | 
| 00:11 | kados | while ( my $record = $batch->next() ) { | 
| 00:11 | my $uxml = $record->as_xml; | |
| 00:11 | $record = MARC::Record::new_from_xml($uxml, 'UTF-8'); | |
| 00:11 | save the file | |
| 00:11 | and re-import | |
| 00:12 | see if that fixes it | |
| 00:13 | thd | reimporting now | 
| 00:16 | kados: that did something wired | |
| 00:16 | s/wired/weird/ | |
| 00:18 | kados: I have the accent on character over now C?zanne is now Ce\x{017a}anne | |
| 00:19 | kados: I think that did not go through right but it looks right with the acute accent except on the wrong character | |
| 00:20 | kados: My spell checker corrupted my post | |
| 00:20 | kados | heh | 
| 00:20 | so what'd it do? | |
| 00:20 | can I look at it? | |
| 00:23 | thd: ? | |
| 00:25 | thd: these are all marc-8 ... or at least claim to be | |
| 00:27 | http://opac.liblime.com/cgi-bi[…]tail.pl?bib=23783 | |
| 00:27 | I can search on CeÃÂ?zanne as well | |
| 00:37 | thd | kados: Ce\x{0301}zanne | 
| 00:37 | hehe spell checker corruption again | |
| 00:38 | kados: my YAZ/PHP client page is in UTF-8 now | |
| 00:39 | and was for the past few months | |
| 00:39 | kados: It saves the records in raw encoding which is MARC-8 for those 5 | |
| 00:40 | kados: what is with the wandering accent though | |
| 00:42 | kados | no idea | 
| 00:42 | it's really weird | |
| 00:42 | thd | kados: before your fix for bulkmarcimport.pl I had consistent ISO-8859 content in the Koha XHTML page with a UTF-8 header | 
| 00:42 | kados | it's also really weird that we can search for it :-) | 
| 00:43 | now what do you have? | |
| 00:43 | thd | kados: It would have looked fine if the header had been ISO-8859 | 
| 00:44 | kados | thd: I need some clarity | 
| 00:44 | thd: are these records MARC-8 or ISO-8859? | |
| 00:44 | thd | kados: now I have wandering accents which are clearly UTF-8 but Koha keeps changing the characters depending on the type of view | 
| 00:44 | kados | thd: or don't you know? | 
| 00:44 | thd: really? | |
| 00:44 | thd: what's an example? | |
| 00:45 | thd | kados: the records were in MARC-8 before import into Koha | 
| 00:45 | kados: although I did not check each one | |
| 00:45 | kados | ok ... so they were converted to utf-8 | 
| 00:45 | and now they are in utf-8 | |
| 00:45 | are the leaders correct? | |
| 00:45 | thd | kados: yes UTF-8 strangeness | 
| 00:46 | kados | leaders are correct | 
| 00:46 | thd: check the MARC view | |
| 00:46 | thd | kados: leaders were not correct before I will check now | 
| 00:46 | kados | thd: the accent's in the right place | 
| 00:46 | http://opac.liblime.com/cgi-bi[…]tail.pl?bib=23783 | |
| 00:46 | 02131cam a2200409 a 4500 | |
| 00:46 | looks right to me | |
| 00:47 | also, in the MARC view the accent is in the right place! :-) | |
| 00:47 | wtf | |
| 00:49 | thd | kados: was your addbiblio.pl fix for the leader? | 
| 00:51 | kados | the leaders for records going in to koha should be automatically fixed now from bulkmarcimport.pl and addbiblio.pl | 
| 00:51 | thd | kados: did you see the mailing list post maybe on koha-devel about Koha UTF-8 code causing problem in Portuguese | 
| 00:51 | kados | (and I'm about to fix the leader 'length' setting too) | 
| 00:52 | recent? | |
| 00:52 | thd | kados: it was a few weeks ago | 
| 00:52 | kados | I missed it | 
| 00:52 | thd: let's focus on our issue | |
| 00:53 | thd: do you see that in the MARC view the accents are corect/ | |
| 00:53 | thd | kados: I did not pay close attention as it was just one record in Portuguese | 
| 00:53 | kados | ? | 
| 00:53 | correct even? | |
| 00:53 | thd | s/record/letter/ | 
| 00:53 | yes correct | |
| 00:53 | kados | I bet I know why | 
| 00:53 | the 'normal' view is pulling the records from the koha tables | |
| 00:54 | the marc view from the marc* tables | |
| 00:54 | thd | Oh and the tables are not UTF-8? | 
| 00:54 | kados | none of the tables are utf-8 | 
| 00:54 | :-) | |
| 00:54 | I | |
| 00:54 | thd | s/tables/original Koha tables/ | 
| 00:55 | kados | none of the tables in koha 2.2 are utf-8 | 
| 00:55 | thd | kados: my Koha MARC tables have been UTF-9 for months | 
| 00:55 | s/9/8/ | |
| 00:56 | kados | new theory | 
| 00:56 | the accent only shifts if the text is a link | |
| 00:57 | thd | kados: I know that the values were correct previously before your fixes in marc_subfield_table | 
| 00:57 | kados | thd: seem right to you? | 
| 00:57 | thd | kados: yes | 
| 00:57 | s/subfields/ | |
| 00:57 | s/subfield/subfields/ | |
| 00:58 | kados: that is where the MARC record data live | |
| 01:00 | MySQL does not know the difference between UTF-9 and ISO-8859 except in search indexing | |
| 01:00 | s/9/8/ | |
| 01:01 | kados | thd: so have we solved all your problems? | 
| 01:01 | except the strange fact that links shift the accents (which I bet is a browser problem) | |
| 01:02 | thd | kados: If we convert the original Koha tables all will be fine and happy. | 
| 01:02 | kados | convert? | 
| 01:02 | convert to what? | |
| 01:03 | thd | ALTER TABLE whatever to UTF-8 | 
| 01:03 | kados | no table in koha 2.2 is in utf-8 | 
| 01:03 | thd | then reimport | 
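What thd abbreviates as "ALTER TABLE whatever to UTF-8" would, on the MySQL 4.1+ of that era, look roughly like this (table name taken from the discussion, purely illustrative; note that MySQL's `utf8` charset is the 3-byte variant):

```sql
ALTER TABLE marc_subfield_table CONVERT TO CHARACTER SET utf8;
```

Converting the column charset does not repair data that was already stored with the wrong encoding, which is why the thread follows the conversion with a reimport.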
| 01:03 | kados | the marc tables aren't in utf-8 currently | 
| 01:03 | thd | kados: Some are on my system | 
| 01:03 | kados | do they work properly? | 
| 01:04 | thd | kados: In fact I rebuilt the current Koha DB with UTF-8 default | 
| 01:05 | kados: actually everything should have been UTF-8 except for update changes from CVS | |
| 01:06 | kados: certainly marc_subfields_table with the MARC data had been fine | |
| 01:06 | kados: i had taken my original rel_2_2 dump and changed all the encodings from ISO-8859 to UTF-8 | |
| 01:07 | kados: then I imported that into a database built with UTF-8 defaults | |
| 01:08 | kados | hmmm | 
| 01:08 | so what you're telling me is that mysql utf-8 works fine for you right? | |
| 01:08 | are all your tables utf-8? | |
| 01:08 | (how did you convert them?) | |
| 01:09 | thd | kados: then I dropped the original ISO-8859 database and had been very happy except that I had confused the CVS update path for a few weeks | 
| 01:10 | kados: so I could not see expected problems because MARC-8 data was still MARC-8 inside Koha until I fixed the CVS update path this morning | |
| 01:12 | kados | i don't understand why utf-8 works fine on my mysql since I haven't changed the tables to handle utf-8 :-) | 
| 01:13 | thd | kados: it was working fine for Carol Ku except for problems that I had supposed to be related to the previous lack of MARC-8 support | 
| 01:13 | kados: not knowing Chinese I could not tell | |
| 01:14 | kados | thd: if you have any MARC records in iso8859 (MARC21) I'd be interested in seeing what happens when they are imported under the new scheme | 
| 01:15 | thd | kados: those are illegal | 
| 01:15 | kados: what kind of criminal do you think I am :) | |
| 01:15 | kados | yea but they exist right? | 
| 01:15 | hehe | |
| 01:16 | thd | kados: well I suggested that they did earlier | 
| 01:16 | kados | anyway ... we digress | 
| 01:16 | thd | kados: I have seen Z39.50 servers claiming to have records in ISO-8859 | 
| 01:17 | for MARC 21 | |
| 01:18 | kados | so the last test I'd like to try tonight | 
| 01:20 | thd | kados: my Koha tables are UTF-8 but they have bad values | 
| 01:20 | kados | thd: could you tell me whether the new bulkimport.pl correctly inserts data into those tables? | 
| 01:21 | thd | kados: They do not even look like UTF-8 values | 
| 01:21 | kados | (ie, are the bad values from previous imports, or are they from current imports?) | 
| 01:22 | thd | kados: Do you mean bulkmarcimport.pl that we had fixed by your suggestion? | 
| 01:22 | kados | yes | 
| 01:22 | thd | kados: yes I reimported with the delete option afterwards | 
| 01:23 | kados | good, so we are all on the same page | 
| 01:23 | none of us can import utf-8 correctly using utf-8 encoded tables | |
| 01:24 | thd | kados: My claim about ghost data even after the delete option was applied came from an old Koha MARC export that I had in the import directory but have subsequently deleted so it is no more. | 
| 01:24 | kados | ahh | 
| 01:24 | good news | |
| 01:24 | thd: what's your impression of the progress we've made between 2.2.5 and 2.2.6? | |
| 01:25 | thd | kados: huge number of bug fixes for show stopping bugs if only MARC-8 worked correctly | 
| 01:25 | kados | we don't need MARC-8 to work now that we convert everything to UTF-8 right? | 
| 01:26 | thd | kados: yes except we only need to track down the problem where the original Koha tables must now be getting improperly converted MARC-8 | 
| 01:27 | kados | thd: they aren't | 
| 01:27 | thd: it's just that firefox is formatting links strangely | |
| 01:27 | thd: when there are accented chars in them | |
| 01:27 | thd | kados: that is a Firefox bug? | 
| 01:27 | kados | thd: dunno | 
| 01:28 | thd | kados: I can see the data in the original Koha tables and it is wrong | 
| 01:28 | kados | er? | 
| 01:28 | what about the marc tables? | |
| 01:29 | thd | Kados: marc_subfields_table looks fine | 
| 01:30 | kados | the koha tables look fine to me: | 
| 01:30 | mysql> select * from biblio where biblionumber='23783'; | |
| 01:31 | | 23783 | NULL | CeÃÂ?zanne & Poussin : | NULL | NULL | NULL | NULL | 1993 | 20060309192428 | NULL | | |
| 01:31 | accent is on the e | |
| 01:31 | thd | kados: then the links are bad from a Firefox bug for you? | 
| 01:31 | kados | I think it's just a trick of the eyes | 
| 01:32 | for some reason, the font we're using makes it look like the accents are on the 'z' | |
| 01:32 | thd | kados: I do not see the accent on #koha | 
| 01:32 | kados | (yea, this channel is iso-8859) | 
| 01:32 | the font only makes it look that way when it's a link | |
| 01:32 | thd | kados: that was my first thought but my eyes are better than that, or do I need glasses for my perfect vision now? | 
| 01:32 | kados | it's right in the marc tables and it's right in the koha tables | 
| 01:33 | and it's right on the normal view for the heading and it's right in the marc view | |
| 01:33 | it's right everywhere but the links | |
| 01:33 | and that's a font or browser issue ... let's move on :-) | |
| 01:33 | thd | kados: code is corrupting the values as in the complaint about a Portuguese letter | 
| 01:34 | kados | where? | 
| 01:37 | I don't see any corruption | |
| 01:37 | thd | not on Google | 
| 01:37 | kados | hmmm | 
| 01:37 | thd | kados: I will see if I can produce a link for you from my system | 
| 01:39 | kados | thd: view source on the opac-detail page | 
| 01:41 | thd | kados: what do you see in the source? | 
| 01:42 | kados | properly accented characters | 
| 01:42 | thd | kados: your locale is utf-8? | 
| 01:43 | kados | thd: i changed the font on the results screen | 
| 01:44 | http://opac.liblime.com/cgi-bi[…]ha/opac-search.pl | |
| 01:44 | do a search on CeÃzanne | |
| 01:44 | like I said before, it's a font / browser issue | |
| 01:44 | can we close this topic once and for all? :-) | |
| 01:46 | thd | kados: I have no results from my search | 
| 01:46 | kados: now I will try searching with UTF-8 | |
| 01:46 | kados | that's what I wanted you to search with in the first place but I couldn't paste it in correctly :-) | 
| 01:47 | thd: are you convinced? | |
| 01:48 | thd | kados: maybe but was it not working previously with no accents? | 
| 01:49 | kados | what? | 
| 01:49 | you mean the search? | |
| 01:49 | thd | kados: when you searched on Liblime with Cezanne no accents did you not find records or was that just my system | 
| 01:49 | ? | |
| 01:50 | kados | dunno ... I never tried it | 
| 01:50 | does it still work on your system? | |
| 01:52 | thd | kados: that was only before you fixed it | 
| 01:52 | kados: we have to fix Firefox now :) | |
| 01:52 | kados | weird | 
| 01:53 | no we have to fix the Verdana font :-) | |
| 01:53 | that seemed to be the problem | |
| 01:53 | soon as i switched to sans-serif it worked fine | |
| 01:53 | thd | kados: so if I force firefox to display a different font it will be cured? | 
| 01:54 | kados | seems easier to just change the default font in Koha | 
| 01:54 | but yes, that should work too | |
| 01:54 | lets move on | |
| 01:54 | what's next | |
| 01:57 | thd | kados: that cured it for LibLime but not my system. I will try my own system again later | 
| 01:58 | kados: I forced a font change in Firefox itself | |
| 01:59 | kados: Carol had suspected her fonts and did not understand about UTF-8. I guess she was at least partly right. | |
| 01:59 | s/UTF/MARC/ | |
| 02:00 | kados: She could create records fine but could not import them. | |
| 02:01 | kados: we need a routine for detecting and converting the home user locale on quarry submission | |
| 02:02 | kados: and we also need query normalisation and index normalisation | |
| 02:02 | s/quarry/query/ | |
| 02:16 | kados: If I type C?zanne, I should find something from my ISO-8859 locale with query normalisation. I should also find something if I type Cezanne, even though the authority controlled values will always have C?zanne in UTF-8. | |
| 02:19 | kados: Users of western European languages do not have UTF-8 locales on their home systems nor do many of your potential customers on their office systems. | |
| 02:22 | kados: Almost no Western European language user wants to send UTF-8 email because it will look like junk to most recipients. | |
| 02:23 | kados | good points | 
| 02:23 | but I don't think I can fix that tonight :-) | |
| 02:23 | thd | kados: no not tonight :) | 
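The query and index normalisation thd asks for usually means decomposing accented characters and folding case so that "Cezanne" and "Cézanne" match the same index key; a minimal sketch (the function name is an assumption, not a Koha routine):

```python
import unicodedata

def normalize_term(s: str) -> str:
    # NFD splits 'é' into 'e' plus a combining acute accent;
    # dropping combining marks and casefolding makes 'Cézanne',
    # 'CEZANNE' and 'Cezanne' all reduce to the same key.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()
```

Applying the same function both at indexing time and to the incoming query gives the accent-insensitive matching described, regardless of the user's locale.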
| 02:23 | kados | in fact, I'm troubleshooting getting the z3950 daemon running on one of my servers | 
| 02:24 | I run it so rarely that I forget if I'm doing it correctly | |
| 02:25 | can't get it going | |
| 02:25 | very strange | |
| 02:25 | thd | kados: I wish mutt or SquirrelMail would allow reading mail in any encoding and sending in UTF-8 but that is an either or choice for the present. | 
| 02:26 | kados: do you need my Koha Z39.50 server hints message? | |
| 02:26 | kados | maybe | 
| 02:26 | thd | s/server/client/ | 
| 02:27 | kados | is it on kohadocs? | 
| 02:28 | thd | kados: No I was going to make it into a FAQ but very few users who had trouble had the patience to get the server running | 
| 02:28 | s/server/client/ | |
| 02:28 | kados | heh | 
| 02:28 | well if _I_ can't get it going ... :-) | |
| 02:30 | sam:/home/nbbc/koha/intranet/scripts/z3950daemon# ./z3950-daemon-launch.sh | |
| 02:30 | Koha directory is /home/nbbc/koha/intranet/scripts/z3950daemon | |
| 02:30 | No directory, logging in with HOME=/ | |
| 02:31 | what am I forgetting? | |
| 02:34 | thd: got that faq handy? | |
| 02:35 | thd | kados: you should have it now | 
| 02:39 | kados | hmmm ... I still can't get it going | 
| 02:39 | thd | kados: did you receive my message? | 
| 02:39 | kados | yep | 
| 02:40 | thd | kados: It was mostly the starting point for identifying the problem | 
| 02:40 | kados | yea | 
| 02:41 | thd | kados: about 5 messages later most users would give up. | 
| 02:41 | kados: If they had continued then I would actually know what to put in a proper FAQ | |
| 02:42 | kados | strangely, I have it running fine on another server | 
| 02:43 | and this server was working too | |
| 02:43 | thd | kados: what is different about the 2 servers? | 
| 02:43 | kados | looks like it just died about a week ago and noone noticed | 
| 02:44 | interestingly: | |
| 02:44 | # ./processz3950queue | |
| 02:44 | Bareword "Net::Z3950::RecordSyntax::USMARC" not allowed while "strict subs" in use at ./processz3950queue line 261. | |
| 02:44 | Bareword "Net::Z3950::RecordSyntax::UNIMARC" not allowed while "strict subs" in use at ./processz3950queue line 262. | |
| 02:44 | Execution of ./processz3950queue aborted due to compilation errors. | |
| 02:45 | ahh ... that's my problem | |
| 02:46 | Net::Z3950 isn't installed :-) | |
| 02:46 | just Net::Z3950::ZOOM | |
| 02:46 | which doesn't yet support the old syntax | |
| 02:46 | thd | :) | 
| 02:46 | kados | ok ... so I need to rewrite the z3950 client tomorrow :-) | 
| 02:46 | thanks for your help thd | |
| 02:47 | thd | kados: you need documentation to write a non-blocking asynchronous client | 
| 02:47 | you're welcome kados | |
| 02:47 | good night | |
| 05:26 | hdl | hi | 
| 05:28 | http://www.ratiatum.com/news29[…]ans_le_calme.html | |
| 05:34 | chris | hmmm, ill have to babelfish that | 
| 05:35 | ohh DRM stuff | |
| 05:36 | hdl | yeah. | 
| 05:37 | The National Assembly is voting this through without objections and without listening to opposing views. | |
| 05:37 | chris | ohh, that is bad news | 
| 05:39 | hdl | They withdrew an article of the law at the beginning of the vote. Then reintroduced it only to vote it down and to pass amendments which promote THEIR vision. AWFUL ! | 
| 05:39 | chris | from what I can tell, it removes any rights to make a private copy ? | 
| 05:41 | is it kind of like the DMCA in the US? | |
| 05:41 | (babelfish doesn't do a very good job of translation, and the french i learnt when i was 12, i have all forgotten :-)) | |
| 06:38 | hdl | chris : In fact it is not like the DMCA, it is WORSE :5 | 
| 07:59 | pierrick_: hi | |
| 07:59 | How are you ? | |
| 08:02 | pierrick_ | Hi hdl, I'm fine, how do you do ? | 
| 08:03 | hdl | quite good. | 
| 08:03 | pierrick_ | I'm writing an email to koha-dev telling what tests I've made about Perl/MySQL/UTF-8 | 
| 08:03 | hdl | Still on zebra import. | 
| 08:03 | Does it work better ? | |
| 08:03 | Are you clear ? | |
| 08:04 | (that is : did you get the problem) | |
| 08:04 | pierrick_ | (it works so nicely I don't understand why Paul has spent so much time on this issue) | 
| 08:04 | maybe I didn't finally understand what was not working | |
| 08:05 | hdl | Have you followed DADVSI ? | 
| 08:05 | pierrick_ | not at all, too confusing for me | 
| 08:06 | and I don't really feel affected since I only listen to the radio... my concern is about free software, and software patents were rejected months ago | |
| 08:08 | hdl | but they are lurking around. | 
| 08:11 | And DADVSI is a matter of culture. French librarians are concerned about it. FSF France is concerned too, since DADVSI would leave open-source software, which CANNOT have DRM by construction, on the fringe, unless Sun makes its open-source DRM a success and a standard. | |
| 08:11 | This is why DADVSI matters for me. | |
| 08:12 | pierrick_ | I understand | 
| 08:17 | hdl | Anyway, thanks to your forthcoming email, I will be able to work in UTF-8. :) | 
| 08:17 | "La vie est belle" ;) | |
| 08:17 | pierrick_ | maybe I miss the point, we'll see | 
| 08:18 | hdl | I shall tell you if that is the case. | 
| 08:31 | can you send a private copy to me... it will be faster: same email address as yours, except henridamien | |
| 08:49 | pierrick_ | sorry, I was not reading IRC, mail was sent | 
| 10:13 | osmoze | hello | 
| 10:13 | hdl | pierrick_: Congratulations for set names feature ;) | 
| 10:27 | pierrick_ | hdl: no no, I didn't find anything more than what Paul found on usenet | 
| 10:27 | the only thing I did was start from scratch with full UTF-8 from beginning to end | |
| 10:28 | hello osmoze | |
| 10:28 | hdl | But you did smart testing from scratch. | 
| 10:29 | pierrick_ | thank you :-) | 
| 10:29 | hdl | The problem is now to make good use of your results and remarks on our databases and in Koha. | 
| 10:30 | Since at each connection, there is the charset problem. | 