UTF-8 - Using PHP explode() on a Unicode string to get the rows in an array
I am trying to read a tab-delimited spreadsheet containing Unicode characters like this:
$content = file_get_contents($filename);
When I print it in the browser, the text is shown correctly. I send this header:
header('content-type: text/html; charset=utf-8');
Now I want to split the content into rows using:
$rows= explode("\n",$content);
The Unicode characters in the content turn into gibberish when I, for instance, print one row:
echo $rows[1];
My question is: what is causing this behaviour, and how can I correct the text in the $rows array? In the end I want to insert the row values into a database, but it inserts gibberish.
Any help is appreciated.
Example
A row before explode() looks like this (note: tabs are not displayed below):
r002 Студия 2В 66 Богдан дорога Санкт-Петербург 3174 45 Андрей Смирнов маркетинг 234-56790 653-23685 dummy@dummy.com 34354547
After explode(), the row looks like this:
r002 ! b c 4 8 o 2 66 > 3 4 0 = 4 > @ > 3 0 ! 0 = : b -¬ 5 b 5 @ 1 c @ 3 3174 45 = 4 @ 5 9 ! < 8 @ = > 2 < 0 @ : 5 b 8 = 3 234-56790 653-23685 dummy@dummy.com 34354547 59
Edit: substring not working
I noticed some strange behaviour. When I run
echo mb_substr($content,0,50,'utf-8');
the output is 25 characters, and the characters are displayed correctly:
r002 Студия 2В 66 Богдан
However, when I change the offset from 0 to, for instance, 5, it's a mess again.
echo mb_substr($content,5,50,'utf-8');
the output is:
02 ! b c 4 8 o 2 66 > 3 4 0 = 4 >
I'm not sure what's going on here... Can it be because the file contains a UTF-8 BOM ("\xef\xbb\xbf")?
I found the solution: it had to do with the file's encoding. The file was exported from Excel, which caused the initial difficulties. Anyway, here is the code that resolves the encoding bit:
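A quick way to check which BOM the data actually starts with is to dump its first bytes. This is only a minimal sketch: detectBom() is a hypothetical helper name (not a PHP built-in), and the sample is built in memory rather than read from the real file.

```php
<?php
// Hypothetical helper: report which byte order mark, if any, starts the data.
function detectBom(string $data): string
{
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($data, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';
    }
    if (strncmp($data, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';
    }
    return 'none';
}

// Assumption: an in-memory UTF-16LE sample stands in for the exported file.
$sample = "\xFF\xFE" . mb_convert_encoding("r002\tСтудия", 'UTF-16LE', 'UTF-8');

echo bin2hex(substr($sample, 0, 4)), "\n"; // hex dump of the first four bytes
echo detectBom($sample), "\n";
```

On the real file, the same check would be run on the result of file_get_contents($filename).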
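$data = file_get_contents($filename);
if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
    // UTF-8 with BOM: already UTF-8, only drop the BOM
    $data = substr($data, 3);
} elseif (strncmp($data, "\xFF\xFE", 2) === 0) {
    // UTF-16 little-endian: drop the BOM and convert
    $data = iconv('UTF-16LE', 'UTF-8', substr($data, 2));
} elseif (strncmp($data, "\xFE\xFF", 2) === 0) {
    // UTF-16 big-endian: drop the BOM and convert
    $data = iconv('UTF-16BE', 'UTF-8', substr($data, 2));
}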
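Once the data is UTF-8, splitting into rows and cells with explode() should behave as originally intended. The following is a rough sketch, not a definitive implementation: toUtf8() is a hypothetical wrapper name, the sample rows are made up, and an in-memory string stands in for the Excel export.

```php
<?php
// Hypothetical wrapper: convert BOM-marked data to UTF-8 and drop the BOM.
function toUtf8(string $data): string
{
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        return substr($data, 3);
    }
    if (strncmp($data, "\xFF\xFE", 2) === 0) {
        return iconv('UTF-16LE', 'UTF-8', substr($data, 2));
    }
    if (strncmp($data, "\xFE\xFF", 2) === 0) {
        return iconv('UTF-16BE', 'UTF-8', substr($data, 2));
    }
    return $data; // no BOM: assume the data is already UTF-8
}

// Assumption: two made-up tab-delimited rows, encoded as UTF-16LE with a BOM.
$utf8   = "r001\tМосква\nr002\tСтудия 2В\n";
$sample = "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', $utf8);

// Convert first, then split: rows on "\n", cells on "\t".
$rows = explode("\n", trim(toUtf8($sample)));
foreach ($rows as $row) {
    $cells = explode("\t", $row);
    echo $cells[0], ' => ', $cells[1], "\n";
}
```

If the export uses Windows line endings, each row will still end with "\r" after the split; calling rtrim() on each row would take care of that.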