MS Access *.MDB conversion to MySQL or SQLite: problem with data encoding

Greetings, here is my case:

  • I'm on Ubuntu Linux.
  • I have several Jet3 .MDB (MS Access database) files, about 500 MB each, whose data is encoded in cp1256/WINDOWS-1256.
  • I created the SQLite databases by following this article: http://cltb.ojuba.org/en/articles/mdb2sqlite

Here is the bash script I wrote to do the conversion, assuming the MS Access file is x.mdb:

# Rewrite the Access schema into SQLite-compatible DDL and create x.db from it
mdb-schema "x.mdb" | perl -wpe 's%^DROP TABLE %DROP TABLE IF EXISTS %i;
  s%(Memo/Hyperlink|DateTime( \(Short\))?)%TEXT%i;
  s%(Boolean|Byte|Numeric|Replication ID|(\w+ )?Integer)%INTEGER%i;
  s%(BINARY|OLE|Unknown ([0-9a-fx]+)?)%BLOB%i;
  s%\s*\(\d+\)\s*(,?[ \t]*)$%${1}%;' | sqlite3 x.db

# Export each table as INSERT statements and load it in a single transaction
for i in $(mdb-tables "x.mdb"); do echo "$i"; (
echo "BEGIN TRANSACTION;";
MDB_JET3_CHARSET="WINDOWS-1256" mdb-export -R ";\n" -I "x.mdb" "$i";
echo "END TRANSACTION;" ) | sqlite3 "x.db"; done

I've tried setting MDB_JET3_CHARSET to WINDOWS-1256, cp1256, WINDOWS-1251, cp1251, and UTF-8. Some of them produce different results when I browse the data, but the text still makes no sense at all.

Thanks in advance, and sorry for my bad English.

Answers


Okay, after browsing around many sites, I stumbled on http://git.ojuba.org/cgit/thawab/tree/ and found a script that gave me an idea (it's bok2ki.py, if anyone is curious). I LOVE OPEN SOURCE!! :)

I added the MDB_ICONV parameter with "UTF-8" as its value, and changed the MDB_JET3_CHARSET value to "cp1256".

Actually, I don't really know what these parameters do, but I'm guessing MDB_JET3_CHARSET defines the charset/encoding/code page of the source Jet3 file (I really don't know the difference between those terms; I should research more) and MDB_ICONV defines the encoding of the target database. Those are just my assumptions, anyway.
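
To illustrate why the source charset matters, here is a small sketch that uses plain iconv instead of mdbtools, so it only approximates what those two parameters do. The sample bytes and the names sample_bytes, good and bad are made up for the demonstration; the bytes are the Arabic word "سلام" as it would be stored in cp1256.

```shell
#!/bin/sh
# Illustration only: iconv stands in for the conversion mdb-export performs.

# Emit the word "سلام" as raw cp1256 bytes (0xD3 0xE1 0xC7 0xE3, written in octal).
sample_bytes() {
  printf '\323\341\307\343'
}

# Right source charset: cp1256 -> UTF-8 yields the readable Arabic word.
good=$(sample_bytes | iconv -f CP1256 -t UTF-8)

# Wrong source charset: the same bytes read as cp1252 become Latin gibberish.
bad=$(sample_bytes | iconv -f CP1252 -t UTF-8)

echo "decoded as cp1256: $good"
echo "decoded as cp1252: $bad"
```

Both conversions succeed without any error, which is probably why a wrong charset shows up only as nonsense text when browsing the data.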

Here is my new script:

# Rewrite the Access schema into SQLite-compatible DDL and create x.db from it
mdb-schema "x.mdb" | perl -wpe 's%^DROP TABLE %DROP TABLE IF EXISTS %i;
  s%(Memo/Hyperlink|DateTime( \(Short\))?)%TEXT%i;
  s%(Boolean|Byte|Numeric|Replication ID|(\w+ )?Integer)%INTEGER%i;
  s%(BINARY|OLE|Unknown ([0-9a-fx]+)?)%BLOB%i;
  s%\s*\(\d+\)\s*(,?[ \t]*)$%${1}%;' | sqlite3 x.db

# Export each table, decoding cp1256 and emitting UTF-8, and load it in one transaction
for i in $(mdb-tables "x.mdb"); do echo "$i"; (
echo "BEGIN TRANSACTION;";
MDB_JET3_CHARSET="cp1256" MDB_ICONV="UTF-8" mdb-export -R ";\n" -I "x.mdb" "$i";
echo "END TRANSACTION;" ) | sqlite3 "x.db"; done
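
As a rough sanity check before loading, the export output can be piped through a UTF-8 validity test. This is only a sketch: is_valid_utf8 is a helper name I made up, and it relies on iconv exiting with a non-zero status when it meets a byte sequence that is not valid in the source encoding.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if everything on stdin is well-formed UTF-8.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Possible use with the export step ("mytable" is a placeholder table name):
# MDB_JET3_CHARSET="cp1256" MDB_ICONV="UTF-8" mdb-export -R ";\n" -I "x.mdb" mytable \
#   | is_valid_utf8 && echo "mytable: valid UTF-8" || echo "mytable: not valid UTF-8"
```

This only catches output that was never converted to UTF-8. Text decoded with the wrong source charset is still valid UTF-8, so the data should also be checked by eye.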
