Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.
Base64 encoding schemes are commonly used when binary data must be stored or transferred over media that are designed to deal with textual data. Encoding ensures that the data remain intact, without modification, during transport. Base64 is used in a number of applications, including email via MIME and storing complex data in XML.
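To make the radix-64 idea concrete, here is a minimal sketch of how three input bytes become four output characters. It assumes the standard alphabet from RFC 4648; the helper name encodeBytesToBase64 is hypothetical and used only for illustration, not a built-in API.

```javascript
// Standard Base64 alphabet (RFC 4648): each index 0-63 maps to one character.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: encodes an array of byte values (0-255) as Base64.
function encodeBytesToBase64(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    // Pack up to three bytes into one 24-bit group, padding with zero bits.
    const chunk =
      (bytes[i] << 16) |
      ((hasSecond ? bytes[i + 1] : 0) << 8) |
      (hasThird ? bytes[i + 2] : 0);
    // Split the 24 bits into four 6-bit indexes into the alphabet.
    out += ALPHABET[(chunk >> 18) & 63];
    out += ALPHABET[(chunk >> 12) & 63];
    // '=' marks positions that carry no input data.
    out += hasSecond ? ALPHABET[(chunk >> 6) & 63] : '=';
    out += hasThird ? ALPHABET[chunk & 63] : '=';
  }
  return out;
}

encodeBytesToBase64([77, 97, 110]); // "TWFu" (the bytes of "Man")
encodeBytesToBase64([10]);          // "Cg==" (a single newline byte)
```

In practice the built-in functions described below should be preferred; the sketch only shows where the four-characters-per-three-bytes ratio and the trailing "=" padding come from.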
In JavaScript there are two functions for decoding and encoding Base64 strings, respectively:
The atob() function decodes a string of data which has been encoded using Base64 encoding. Conversely, the btoa() function creates a Base64-encoded ASCII string from a "string" of binary data.
Both atob() and btoa() work on strings. If you want to work with ArrayBuffers, see Solution #2 below, which is based on typed arrays.
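A short sketch of the round trip may help here. It assumes an environment where btoa() and atob() are available as globals (browsers, and recent Node.js releases). The second part illustrates what a "string" of binary data means: each character stands for one byte.

```javascript
// Encode an ASCII string and decode it again.
const encoded = btoa('Hello, World!'); // "SGVsbG8sIFdvcmxkIQ=="
const decoded = atob(encoded);         // "Hello, World!"

// A "binary string": each character's code (0-255) is treated as one byte.
const binaryString = String.fromCharCode(0xff, 0x00);
const encodedBytes = btoa(binaryString); // "/wA="

// Decoding yields a binary string again; charCodeAt() recovers the bytes.
const firstByte = atob(encodedBytes).charCodeAt(0); // 255
```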
The "Unicode Problem"
Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception (reported as an InvalidCharacterError DOMException in current browsers) if a character exceeds the range of an 8-bit byte (0x00 to 0xFF). There are two possible methods to solve this problem:
- the first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it;
- the second one is to convert the UTF-16 DOMString to an array of UTF-8 bytes and then encode it.
Here are the two possible methods.
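Before turning to the solutions, the failure itself is easy to reproduce. The following is a minimal sketch; the helper name canEncodeWithBtoa is hypothetical, and the exact exception type and message may vary between environments, so the code only checks whether btoa() throws.

```javascript
// Hypothetical helper: reports whether btoa() accepts the given string.
function canEncodeWithBtoa(str) {
  try {
    btoa(str);
    return true;
  } catch (e) {
    // Thrown when a character's code unit is above 0xFF.
    return false;
  }
}

canEncodeWithBtoa('\u00E0 la mode');        // true: U+00E0 fits in one byte
canEncodeWithBtoa('\u2713 \u00E0 la mode'); // false: U+2713 is out of range
```

Note that even when btoa() succeeds on a character such as U+00E0, it encodes the single byte 0xE0 rather than the character's UTF-8 bytes, which is a further reason to convert to UTF-8 first.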
Solution #1 – escaping the string before encoding it
function b64EncodeUnicode(str) {
  // First we use encodeURIComponent to get percent-encoded UTF-8,
  // then we convert the percent encodings into raw bytes which
  // can be fed into btoa.
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
    function toSolidBytes(match, p1) {
      return String.fromCharCode(parseInt(p1, 16));
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="
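To see what each stage of b64EncodeUnicode produces, the following sketch traces a single character (U+2713, the check mark) through the same steps. The intermediate variable names are illustrative only.

```javascript
// Step 1: percent-encode the character as UTF-8.
const percentEncoded = encodeURIComponent('\u2713'); // "%E2%9C%93"

// Step 2: turn each %XX escape into one character whose code is that byte.
const binary = percentEncoded.replace(/%([0-9A-F]{2})/g, (match, hex) =>
  String.fromCharCode(parseInt(hex, 16)));
// binary now has length 3, with character codes 0xE2, 0x9C, 0x93.

// Step 3: every character is within 0x00-0xFF, so btoa() accepts it.
const base64 = btoa(binary); // "4pyT"
```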
To decode the Base64-encoded value back into a String:
function b64DecodeUnicode(str) {
  // Going backwards: from bytestream, to percent-encoding, to original string.
  return decodeURIComponent(atob(str).split('').map(function(c) {
    return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
  }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"
Unibabel implements common conversions using this strategy.
Solution #2 – rewrite the DOM's atob() and btoa() using JavaScript's TypedArrays and UTF-8
Use a TextEncoder polyfill such as TextEncoding (which also includes legacy Windows, Mac, and ISO encodings) or TextEncoderLite, combined with a Buffer and a Base64 implementation such as base64-js.
When a native TextEncoder implementation is not available, the most lightweight solution would be to use TextEncoderLite with base64-js. Use the browser implementation when you can.
The following functions implement such a strategy. They assume base64-js is imported as <script type="text/javascript" src="base64js.min.js"></script>. Note that TextEncoderLite only works with UTF-8.
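function Base64Encode(str, encoding = 'utf-8') {
  // The typeof check avoids a ReferenceError when TextEncoder is not defined.
  // A native TextEncoder always produces UTF-8 and ignores the argument.
  var Encoder = typeof TextEncoder !== 'undefined' ? TextEncoder : TextEncoderLite;
  var bytes = new Encoder(encoding).encode(str);
  return base64js.fromByteArray(bytes);
}

function Base64Decode(str, encoding = 'utf-8') {
  var Decoder = typeof TextDecoder !== 'undefined' ? TextDecoder : TextDecoderLite;
  var bytes = base64js.toByteArray(str);
  return new Decoder(encoding).decode(bytes);
}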
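Where native TextEncoder, TextDecoder, btoa() and atob() are all available, the same byte-oriented strategy can be sketched without any external library. This is a variation on the functions above, not a replacement for base64-js; the names base64EncodeUtf8 and base64DecodeUtf8 are hypothetical, and the sketch handles UTF-8 only.

```javascript
// Hypothetical helper: string -> UTF-8 bytes -> binary string -> Base64.
function base64EncodeUtf8(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // one character per byte
  }
  return btoa(binary);
}

// Hypothetical helper: Base64 -> binary string -> UTF-8 bytes -> string.
function base64DecodeUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

base64EncodeUtf8('\u2713 \u00E0 la mode'); // "4pyTIMOgIGxhIG1vZGU="
base64DecodeUtf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
```

The results match those of Solution #1 because both approaches end up Base64-encoding the same UTF-8 bytes.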