About Character Encoding

Modified on Mon, Jun 16 at 10:02 AM


What is Character Encoding?


A character encoding is a scheme that assigns a number to each character, and defines how that number is stored as bytes, so that text can be handled on a computer.



Character Encodings Supported by TAO


TAO supports the following two character encodings:


(1) UTF-8

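UTF-8 is one of the encodings of Unicode. It represents each Unicode character as a sequence of one to four bytes.

Unicode is a character standard that assigns a unique number, called a code point, to characters from writing systems around the world. Code points are conventionally written in hexadecimal (for example, U+3042 for あ).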


(2) Shift-JIS

Shift-JIS is a character encoding designed specifically for the Japanese language. It can represent far fewer characters than UTF-8.
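
To make the difference between the two encodings concrete, the following minimal Python sketch prints the Unicode code point of a character, its UTF-8 bytes, and its Shift-JIS bytes when it has any. The function name describe is hypothetical, and Python's built-in "shift_jis" codec is used only as an assumption; this article does not state which Shift-JIS variant TAO uses, so results for individual characters may differ.

```python
def describe(ch):
    """Return (code point label, UTF-8 bytes, Shift-JIS bytes or None)."""
    code_point = f"U+{ord(ch):04X}"
    utf8 = ch.encode("utf-8")
    try:
        # Assumption: Python's "shift_jis" codec stands in for Shift-JIS here.
        sjis = ch.encode("shift_jis")
    except UnicodeEncodeError:
        sjis = None  # the character has no representation in this codec
    return code_point, utf8, sjis


# A Latin letter, a hiragana character, and an emoji.
for ch in ["A", "あ", "😀"]:
    code_point, utf8, sjis = describe(ch)
    sjis_text = sjis.hex(" ") if sjis is not None else "not in Shift-JIS"
    print(code_point, "| UTF-8:", utf8.hex(" "), "| Shift-JIS:", sjis_text)
```

In this sketch the emoji can be encoded in UTF-8 but not in Shift-JIS, which is the kind of character affected by the export conversion.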



Character Encoding for Input and Output in TAO


Applicant Input and Data Storage


All data is stored in UTF-8 format.


When Exporting CSV Files


You can choose between UTF-8 and Shift-JIS.

If you export in UTF-8, all characters entered by the applicant will be exported as-is.


On the other hand, if you export in Shift-JIS, the data is converted from UTF-8 to Shift-JIS.

In that case, any character that cannot be represented in Shift-JIS is automatically replaced by the system with an underscore (_) to prevent character corruption.
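
The replacement behavior described above can be approximated as follows. This is only an illustrative sketch, not TAO's actual implementation: the handler name "underscore" and the function to_shift_jis are hypothetical, and Python's "shift_jis" codec is assumed to stand in for the Shift-JIS variant used on export.

```python
import codecs


def _underscore_handler(error):
    # Hypothetical error handler: substitute "_" for each character
    # that the target encoding cannot represent.
    if not isinstance(error, UnicodeEncodeError):
        raise error
    return ("_" * (error.end - error.start), error.end)


codecs.register_error("underscore", _underscore_handler)


def to_shift_jis(text):
    """Encode text as Shift-JIS, replacing unencodable characters with "_"."""
    return text.encode("shift_jis", errors="underscore")


# The emoji has no Shift-JIS representation, so it becomes an underscore.
exported = to_shift_jis("田中😀")
round_trip = exported.decode("shift_jis")  # "田中_"
```

Because the replacement happens during conversion, the original character cannot be recovered from the Shift-JIS file; exporting in UTF-8 avoids this loss.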



About Systems Used to Import CSV Files


CSV files exported from TAO are expected to be imported into software such as Excel or an internal university system.

If the character encoding setting of the importing system does not match the encoding of the CSV file, character corruption may occur.
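
As an illustration of such a mismatch, the sketch below decodes the same CSV bytes with the matching and with the wrong encoding. The sample data and the function name read_csv are made up for this example, and the importing system is simulated with Python's csv module; real importing systems have their own encoding settings.

```python
import csv
import io


def read_csv(data, encoding):
    """Decode CSV bytes with the given encoding and return the rows."""
    # errors="replace" keeps the demo from raising on invalid byte sequences.
    text = data.decode(encoding, errors="replace")
    return list(csv.reader(io.StringIO(text)))


# Hypothetical export: a small CSV file saved as UTF-8.
original_rows = [["氏名", "学部"], ["山田太郎", "工学部"]]
csv_bytes = "氏名,学部\n山田太郎,工学部\n".encode("utf-8")

# Matching encoding: the rows come back unchanged.
correct_rows = read_csv(csv_bytes, "utf-8")

# Mismatched encoding: the same bytes are misread and the text is corrupted.
garbled_rows = read_csv(csv_bytes, "shift_jis")
```

Setting the importing system to the same encoding that was chosen at export time is what keeps the rows intact.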

