Node.js Interview Prep
Streams and Buffers

Buffers

Working with Raw Binary Data in Node.js

LinkedIn Hook

"Why does Node.js have a Buffer class when JavaScript already has strings and arrays?"

Here's the dirty secret: before ES6, JavaScript had no way to represent raw binary data. No bytes. No octets. Just strings (which are UTF-16) and numbers (which are 64-bit floats). That's a disaster when you're building a server that has to read files, decrypt payloads, hash passwords, or push packets across a TCP socket.

Node.js shipped Buffer in 2009 to fix this: a fixed-size chunk of raw memory whose bytes live outside the V8-managed JavaScript heap. The garbage collector tracks only a small wrapper object and releases the backing memory when that wrapper is collected. Buffer is one of the lowest-level primitives in the Node runtime, and the bytes that flow through fs, net, http, crypto, and zlib are carried in Buffers.

Most developers never think about it. They fs.readFile(path, 'utf8') and get a string back, blissfully unaware that under the hood Node allocated a Buffer, decoded the bytes, and threw the Buffer away. But the moment you have to handle a binary file, parse a custom protocol, or process an image — the abstraction leaks, and you'd better know how Buffer works.

In Lesson 4.1 I cover Buffer.alloc vs Buffer.allocUnsafe (and the security footgun nobody mentions), encodings, conversions, and when raw bytes hit your code.

Read the full lesson -> [link]

#NodeJS #BackendDevelopment #JavaScript #BinaryData #InterviewPrep




What You'll Learn

  • What a Buffer actually is — a fixed-size chunk of raw memory outside the V8 heap
  • Why Node.js needed Buffers before JavaScript had Uint8Array and ArrayBuffer
  • The three ways to create Buffers: Buffer.alloc, Buffer.allocUnsafe, and Buffer.from
  • The critical security difference between alloc and allocUnsafe
  • All the encodings Node supports (utf8, base64, hex, latin1, ascii) and when to use each
  • Converting between Buffers and strings safely
  • Concatenating, slicing, and comparing Buffers
  • How Buffer relates to ArrayBuffer and Uint8Array
  • Where you actually meet Buffers: file I/O, sockets, crypto, zlib

The Shipping Container Analogy

Imagine a freight company moving goods across the world. The trucks, ships, and trains do not understand "books," "shoes," or "electronics." They only understand shipping containers — standardized metal boxes of a fixed size. When you want to move a book, you put it in a container. When you want to move shoes, you put them in a container. The transport network only deals with the containers, never the contents.

A Buffer is the shipping container of Node.js. The operating system, network cards, disk controllers, and crypto libraries do not understand JavaScript strings, objects, or numbers. They only understand bytes — sequences of 8-bit unsigned integers from 0 to 255. So when you want to read a file, send a TCP packet, or hash a password, Node packs your data into a fixed-size container of bytes (a Buffer) and hands it to the operating system. When data comes back, it arrives in another container, and your job is to unpack it.

Once allocated, a container has a fixed size. You cannot "grow" it; you can only allocate a bigger one and copy the old bytes across. This is unlike a JavaScript Array, which resizes silently. The constraint exists because a Buffer is a view over one contiguous block of memory with a fixed length, so there is no room to extend it in place.
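The allocate-and-copy step described above can be sketched as follows. growBuffer is a hypothetical helper name used only for illustration; it is not part of Node's API.

```javascript
// Hypothetical helper: Buffers cannot be resized, so "growing" means
// allocating a larger Buffer and copying the old bytes into it.
function growBuffer(oldBuf, newSize) {
  if (newSize < oldBuf.length) {
    throw new RangeError('newSize must be >= the current length');
  }
  const bigger = Buffer.alloc(newSize); // zero-filled, so the unused tail is safe
  oldBuf.copy(bigger, 0);               // copy all old bytes to offset 0
  return bigger;
}

const small = Buffer.from('abc');
const grown = growBuffer(small, 6);
console.log(small.length); // 3 (the original is untouched)
console.log(grown);        // <Buffer 61 62 63 00 00 00>
```

The original Buffer keeps its size; callers have to switch to the returned one.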

+---------------------------------------------------------------+
|              WHERE BUFFERS LIVE IN MEMORY                     |
+---------------------------------------------------------------+
|                                                                |
|   +---------------------------+    +----------------------+    |
|   |        V8 HEAP            |    |   OUTSIDE V8 HEAP    |    |
|   |   (managed by GC)         |    |  (manual / pooled)   |    |
|   |                           |    |                      |    |
|   |   - JS strings            |    |   - Buffer bytes     |    |
|   |   - JS objects            |    |   - ArrayBuffer data |    |
|   |   - Closures              |    |                      |    |
|   |   - Numbers               |    |   Allocated by Node  |    |
|   |                           |    |   via malloc / pool  |    |
|   |   Limited to ~1.5 GB      |    |                      |    |
|   |   by default              |    |   Not counted toward |    |
|   |                           |    | --max-old-space-size |    |
|   +---------------------------+    +----------------------+    |
|              ^                                ^                |
|              |                                |                |
|       JS variables hold              Buffer instance is        |
|       a small wrapper object         a tiny JS object that     |
|       on the heap that POINTS        references the external   |
|       to the external bytes ------>  byte storage              |
|                                                                |
+---------------------------------------------------------------+

This separation is the whole point. If Buffers lived on the V8 heap, every 100 MB file read would compete with your application objects for the heap limit (historically around 1.5 GB on 64-bit systems; the default varies by Node version and available memory), and the garbage collector would have to track and potentially move those large blocks. By keeping the bytes outside the managed heap, Node can move gigabytes through the system with far less GC pressure.
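One rough way to observe this separation is to compare process.memoryUsage() before and after a large allocation. This is only a sketch: it assumes a Node version that reports the arrayBuffers field (13.9 or later), the 64 MB size is arbitrary, and the exact numbers vary by version and platform, so no output is shown.

```javascript
// Sketch: compare V8 heap usage with ArrayBuffer-backed memory before and
// after allocating a large Buffer.
const SIZE = 64 * 1024 * 1024; // 64 MB, chosen arbitrarily for the demo

const before = process.memoryUsage();
const big = Buffer.alloc(SIZE);
const after = process.memoryUsage();

const toMB = (n) => (n / 1024 / 1024).toFixed(1);
console.log('heapUsed delta     (MB):', toMB(after.heapUsed - before.heapUsed));
console.log('arrayBuffers delta (MB):', toMB(after.arrayBuffers - before.arrayBuffers));
console.log('buffer length      (MB):', toMB(big.length));
```

The heapUsed delta should stay small while the arrayBuffers delta is close to the Buffer's size.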


What a Buffer Actually Is

A Buffer is a subclass of Uint8Array. That single sentence answers half the interview questions you'll ever get about it.

// Every Buffer IS a Uint8Array
const buf = Buffer.from('hello');
console.log(buf instanceof Uint8Array);  // true
console.log(buf instanceof Buffer);      // true

// You can pass a Buffer to any API that wants a Uint8Array
const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
console.log(view.getUint8(0));           // 104 (the 'h' character)

A Buffer adds Node-specific helpers on top of Uint8Array: encoding-aware conversions (toString('utf8')), I/O-friendly methods (write, readInt32BE), pooling for small allocations, and the static factory methods we'll cover next.
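As a small illustration of those helpers sitting next to inherited Uint8Array behaviour, the sketch below writes a few values into one Buffer and reads them back. The byte layout is made up for the example.

```javascript
const buf = Buffer.alloc(8);

// Node-specific helpers: typed integer writes and encoding-aware string writes.
buf.writeUInt16BE(0x0102, 0);      // bytes 0-1: 01 02
buf.writeInt32LE(-2, 2);           // bytes 2-5: fe ff ff ff
buf.write('hi', 6, 'utf8');        // bytes 6-7: 68 69

console.log(buf.toString('hex'));  // '0102feffffff6869'
console.log(buf.readUInt16BE(0));  // 258
console.log(buf.readInt32LE(2));   // -2

// Methods inherited from Uint8Array work on the same object.
const sum = buf.reduce((total, byte) => total + byte, 0);
console.log(sum);                        // 1231
console.log(buf.some((b) => b > 0xf0));  // true
```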


Why Buffer Exists — A Brief History

When Node.js launched in 2009, JavaScript had no standard binary type. Typed arrays (Uint8Array and ArrayBuffer) first appeared in browsers around 2010-2011 and were only standardized in ES6/2015. If you tried to read a PNG file with the only tool available, a string, you'd corrupt half the bytes, because decoding arbitrary bytes as text is lossy: byte sequences that are not valid in the chosen encoding get replaced or mangled.

Ryan Dahl needed a way to represent raw bytes for HTTP, file I/O, and TCP sockets, so he added the Buffer class to Node's standard library. It became Node's de-facto binary type for years.

When ES6 standardized Uint8Array and ArrayBuffer, Node didn't drop Buffer — it rebased it as a subclass of Uint8Array. This kept all the existing Node code working while making Buffers fully compatible with the Web Platform's binary types. Today, Buffer is essentially "a Uint8Array with Node-specific batteries included."


Creating Buffers — The Three Constructors

There are three ways to make a Buffer in modern Node.js. The old new Buffer() constructor is deprecated and unsafe — never use it.

1. Buffer.alloc(size) — Safe, Zero-Filled

// Allocate 10 bytes, every byte initialized to 0
const safe = Buffer.alloc(10);
console.log(safe);  // <Buffer 00 00 00 00 00 00 00 00 00 00>

// Allocate and fill with a specific byte
const filled = Buffer.alloc(10, 0xff);
console.log(filled);  // <Buffer ff ff ff ff ff ff ff ff ff ff>

// Fill with a string pattern (repeats to fill the size)
const pattern = Buffer.alloc(10, 'ab');
console.log(pattern.toString());  // 'ababababab'

Buffer.alloc is always safe. Node guarantees every byte is zeroed before you see the buffer. This costs a small CPU hit (the memory has to be zero-filled), but it eliminates an entire class of memory-disclosure bugs.

2. Buffer.allocUnsafe(size) — Fast, Uninitialized

// Allocate 10 bytes WITHOUT zeroing them out
const unsafe = Buffer.allocUnsafe(10);
console.log(unsafe);  // <Buffer ?? ?? ?? ?? ?? ?? ?? ?? ?? ??>
                      // Contents are whatever was in that memory before!

// Standard pattern: allocate unsafe, then immediately overwrite
const buf = Buffer.allocUnsafe(4);
buf.writeUInt32BE(0xdeadbeef, 0);
console.log(buf);  // <Buffer de ad be ef>

Buffer.allocUnsafe skips the zero-fill step, so it's faster — but the returned buffer contains whatever garbage was in that region of memory before. That garbage might be old request bodies, old database rows, old session tokens, or fragments of someone else's password.

The security rule: only use allocUnsafe if you are about to immediately and completely overwrite every byte before the buffer is exposed to anything else. If you allocate 1024 bytes and only fill 500, the last 524 bytes still hold old memory. If that buffer is then sent to a client, you have a textbook information-disclosure vulnerability. Heartbleed belongs to the same broad family of memory-disclosure bugs, although its root cause was a buffer over-read rather than an uninitialized allocation.
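A minimal sketch of that rule, assuming a made-up wire format (a 4-byte big-endian length prefix followed by the payload). encodeFrame and paddedBlock are hypothetical helper names, not Node APIs.

```javascript
// allocUnsafe is acceptable here only because every byte is overwritten
// before the Buffer leaves this function.
function encodeFrame(payload) {
  const frame = Buffer.allocUnsafe(4 + payload.length);
  frame.writeUInt32BE(payload.length, 0); // bytes 0-3: length prefix
  payload.copy(frame, 4);                 // bytes 4..end: payload
  return frame;
}

// When only part of the Buffer will be written, clear the rest explicitly
// (or simply use Buffer.alloc instead).
function paddedBlock(data, blockSize) {
  const block = Buffer.allocUnsafe(blockSize);
  const written = data.copy(block, 0);    // copies at most blockSize bytes
  block.fill(0, written);                 // zero the unwritten tail
  return block;
}

console.log(encodeFrame(Buffer.from('hi')));    // <Buffer 00 00 00 02 68 69>
console.log(paddedBlock(Buffer.from('hi'), 4)); // <Buffer 68 69 00 00>
```

In both helpers no byte of the returned Buffer is left holding whatever was previously in memory.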

3. Buffer.from(...) — Convert Existing Data

// From a string (default encoding is utf8)
const fromStr = Buffer.from('hello');
console.log(fromStr);  // <Buffer 68 65 6c 6c 6f>

// From a string with explicit encoding
const fromB64 = Buffer.from('aGVsbG8=', 'base64');
console.log(fromB64.toString());  // 'hello'

// From a hex string
const fromHex = Buffer.from('deadbeef', 'hex');
console.log(fromHex);  // <Buffer de ad be ef>

// From an array of byte values
const fromArr = Buffer.from([0x68, 0x65, 0x6c, 0x6c, 0x6f]);
console.log(fromArr.toString());  // 'hello'

// From another Buffer (creates a COPY, not a view)
const original = Buffer.from('hello');
const copy = Buffer.from(original);
copy[0] = 0x48;
console.log(original.toString());  // 'hello' (unchanged)
console.log(copy.toString());      // 'Hello'

Buffer.from is the only constructor for converting existing data. It is always safe — the source data dictates the contents.
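One caveat worth knowing: Buffer.from copies when given a typed array, but creates a view when given an ArrayBuffer. The sketch below shows the difference; the variable names are illustrative.

```javascript
const bytes = new Uint8Array([1, 2, 3, 4]);

// Passing the typed array itself copies its contents.
const copied = Buffer.from(bytes);

// Passing the underlying ArrayBuffer creates a view over the same memory.
const shared = Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);

bytes[0] = 0xff;
console.log(copied[0]); // 1   (independent copy)
console.log(shared[0]); // 255 (same memory as `bytes`)
```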


Encodings — How Bytes Become Text

A Buffer holds raw bytes. To turn those bytes into a string (or vice versa), you must specify an encoding — a rule for mapping byte sequences to characters.

const buf = Buffer.from('Hello, World!', 'utf8');

// utf8 — variable-width Unicode (1 to 4 bytes per character)
console.log(buf.toString('utf8'));     // 'Hello, World!'

// hex — each byte becomes two hexadecimal characters
console.log(buf.toString('hex'));      // '48656c6c6f2c20576f726c6421'

// base64: every 3 bytes become 4 ASCII characters (text-safe, but not URL-safe)
console.log(buf.toString('base64'));   // 'SGVsbG8sIFdvcmxkIQ=='

// base64url — base64 with - and _ instead of + and /
console.log(buf.toString('base64url')); // 'SGVsbG8sIFdvcmxkIQ'

// latin1 — each byte is one character (0-255), lossless for raw bytes
console.log(buf.toString('latin1'));   // 'Hello, World!'

// ascii — 7-bit only, top bit stripped, lossy for non-ASCII
console.log(buf.toString('ascii'));    // 'Hello, World!'

When to use which:

  • utf8 — the default, and what you want for human-readable text. Variable width. The whole web speaks it.
  • hex — debugging, hashing output, low-level protocol dumps. Doubles the size.
  • base64 — embedding binary in JSON, URLs, email, or any text-only channel. ~33% larger than the raw bytes.
  • latin1 — a 1-to-1 byte-to-char mapping. Useful for "I want a string but I want every byte preserved." Never use it for actual Latin-1 text from the web.
  • ascii — almost never. The top bit is stripped, which silently corrupts non-ASCII bytes. Use utf8 instead.
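To make the "lossless" and "lossy" labels above concrete, the sketch below round-trips a few non-text bytes through each encoding. survivesRoundTrip is a hypothetical helper written for this example.

```javascript
// Returns true if decoding and re-encoding with `encoding` preserves every byte.
function survivesRoundTrip(buf, encoding) {
  return Buffer.from(buf.toString(encoding), encoding).equals(buf);
}

// Arbitrary binary data: the first three bytes are not valid UTF-8.
const raw = Buffer.from([0xff, 0xfe, 0x80, 0x41]);

for (const enc of ['hex', 'base64', 'latin1', 'utf8', 'ascii']) {
  console.log(enc.padEnd(6), survivesRoundTrip(raw, enc) ? 'lossless' : 'LOSSY');
}
// hex, base64 and latin1 print 'lossless'; utf8 and ascii print 'LOSSY'.

// String length counts UTF-16 code units, not bytes.
const word = 'h\u00e9llo'; // 'héllo'
console.log(word.length);                     // 5
console.log(Buffer.byteLength(word, 'utf8')); // 6
```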

Concatenating, Slicing, and Comparing

// Concat — joins multiple buffers into one
const a = Buffer.from('Hello, ');
const b = Buffer.from('World');
const c = Buffer.from('!');

// Pass an array of buffers; the second arg (total length) is optional but
// faster because it skips the length-calculation pass.
const joined = Buffer.concat([a, b, c], a.length + b.length + c.length);
console.log(joined.toString());  // 'Hello, World!'

// Slice / subarray — creates a VIEW into the same memory (no copy)
const view = joined.subarray(0, 5);
view[0] = 0x68;  // lowercase 'h'
console.log(joined.toString());  // 'hello, World!' (mutated!)

// To get an independent copy, use Buffer.from
const independent = Buffer.from(joined.subarray(0, 5));
independent[0] = 0x59;
console.log(joined.toString());      // 'hello, World!' (unchanged)
console.log(independent.toString()); // 'Yello'

// Compare — byte-by-byte equality
const x = Buffer.from('abc');
const y = Buffer.from('abc');
const z = Buffer.from('abd');
console.log(x.equals(y));   // true
console.log(x.equals(z));   // false
console.log(x.compare(z));  // -1 (x sorts before z)

// IMPORTANT: for cryptographic comparisons (HMACs, tokens), use
// crypto.timingSafeEqual to defeat timing attacks. Buffer.equals is not
// guaranteed to run in constant time, so its duration can leak how many
// leading bytes matched. timingSafeEqual requires equal-length inputs.

A common gotcha: subarray (and the older alias slice) returns a view that shares memory with the original. Mutating the view mutates the original. If you need an independent copy, wrap it in Buffer.from.
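The timing-safe comparison mentioned in the code comments above can be sketched like this. secretsMatch is a hypothetical helper; hashing both inputs first is one common way to satisfy timingSafeEqual's equal-length requirement, not the only one.

```javascript
const crypto = require('node:crypto');

// crypto.timingSafeEqual throws if the two inputs differ in length, so both
// values are hashed to fixed-size SHA-256 digests before comparing.
function secretsMatch(a, b) {
  const ha = crypto.createHash('sha256').update(a).digest();
  const hb = crypto.createHash('sha256').update(b).digest();
  return crypto.timingSafeEqual(ha, hb);
}

console.log(secretsMatch(Buffer.from('token-123'), Buffer.from('token-123'))); // true
console.log(secretsMatch(Buffer.from('token-123'), Buffer.from('token-124'))); // false
console.log(secretsMatch(Buffer.from('short'), Buffer.from('much longer')));   // false
```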


A Real File-I/O Example

const fs = require('node:fs/promises');

async function readFirstBytes(path, n) {
  // Open the file and read the first n bytes into a fresh buffer.
  const handle = await fs.open(path, 'r');
  try {
    // alloc (not allocUnsafe) because we may not fill the whole buffer
    // if the file is smaller than n bytes.
    const buf = Buffer.alloc(n);
    const { bytesRead } = await handle.read(buf, 0, n, 0);

    // Slice down to the actual bytes read so we don't show trailing zeros.
    const actual = buf.subarray(0, bytesRead);

    // Print the magic number in hex — useful for sniffing file types.
    console.log('hex   :', actual.toString('hex'));
    console.log('latin1:', actual.toString('latin1'));

    // PNG magic: 89 50 4e 47, JPEG magic: ff d8 ff
    if (actual[0] === 0x89 && actual.toString('latin1', 1, 4) === 'PNG') {
      console.log('Detected: PNG image');
    }
  } finally {
    await handle.close();
  }
}

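// Surface errors (for example a missing file) instead of leaving the
// promise rejection unhandled.
readFirstBytes('./photo.png', 8).catch(console.error);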

This is the bread-and-butter shape of every binary-handling routine in Node: allocate a buffer, ask the OS to fill it, slice down to the real length, then inspect or convert.


Buffer vs ArrayBuffer vs Uint8Array

These three names confuse everyone. Here is the precise relationship:

+---------------------------------------------------------------+
|                                                                |
|   ArrayBuffer                                                  |
|   ----------                                                   |
|   The raw block of memory itself. Just bytes. No methods       |
|   for reading or writing -- it's opaque storage.               |
|                                                                |
|         |                                                      |
|         | "viewed through"                                     |
|         v                                                      |
|                                                                |
|   Uint8Array (and Int16Array, Float32Array, DataView, ...)     |
|   ----------                                                   |
|   A typed window onto an ArrayBuffer that lets you read and    |
|   write specific element types. Standardized in ES6.           |
|                                                                |
|         |                                                      |
|         | "Node.js subclass"                                   |
|         v                                                      |
|                                                                |
|   Buffer                                                       |
|   ------                                                       |
|   A Uint8Array with extra Node helpers: encodings, pooling,    |
|   readInt32BE, writeUInt16LE, toString('hex'), etc.            |
|                                                                |
+---------------------------------------------------------------+

const buf = Buffer.from('hello');

// Buffer is-a Uint8Array is-a TypedArray
console.log(buf instanceof Buffer);      // true
console.log(buf instanceof Uint8Array);  // true

// .buffer is the underlying ArrayBuffer (often shared with a pool)
console.log(buf.buffer instanceof ArrayBuffer);  // true

// .byteOffset and .byteLength tell you where this Buffer's window
// sits inside the underlying ArrayBuffer (because of pooling).
console.log(buf.byteOffset);  // some pool offset
console.log(buf.byteLength);  // 5

The pooling detail is important. Small Buffers created with Buffer.allocUnsafe, Buffer.from, or Buffer.concat (smaller than roughly half of Buffer.poolSize, which defaults to 8 KB, so under 4 KB) may be carved out of a shared internal pool, and many of them can share a single 8 KB ArrayBuffer. Buffer.alloc never uses the pool. That means buf.buffer is not necessarily a private region; if you pass it directly to a Web API, you may expose unrelated bytes. Always pass buf.buffer, buf.byteOffset, and buf.byteLength together when interoperating.
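A short sketch of the difference between handing over the whole underlying ArrayBuffer and handing over only this Buffer's window. The variable names are illustrative, and whether a given Buffer is actually pooled depends on how it was created.

```javascript
const buf = Buffer.from('hello'); // small, so it may come from the shared pool

// Risky: a view over the whole underlying ArrayBuffer, which may be the pool.
const wholeBacking = new Uint8Array(buf.buffer);

// Safe: a view limited to this Buffer's window.
const exactView = new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);

// Safe (copy): a standalone ArrayBuffer holding only these 5 bytes.
const exactCopy = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);

console.log(wholeBacking.length >= buf.length);  // true (often much larger)
console.log(exactView.length);                   // 5
console.log(exactCopy.byteLength);               // 5
console.log(Buffer.from(exactCopy).toString());  // 'hello'
```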


Where You Actually Meet Buffers

Even if you never type the word Buffer, your code is already using them everywhere:

  • fs.readFile(path) without an encoding returns a Buffer. Pass 'utf8' and Node converts it to a string for you.
  • net.Socket and tls.Socket emit 'data' events whose payload is a Buffer.
  • http.IncomingMessage emits chunks as Buffer instances unless you call setEncoding.
  • crypto.createHash('sha256').digest() returns a Buffer (or a hex string if you pass 'hex').
  • zlib.gzipSync(input) takes and returns Buffers.
  • JSON.parse(buf) silently coerces the buffer to a string first — fine for utf8, broken for any other encoding.

Knowing the layer you're at matters. If a library hands you a Buffer, you decide the encoding. If it hands you a string, the library has already decided for you — and if it guessed wrong, your data is corrupted before you ever see it.
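A few of the cases listed above, sketched with in-memory data so the example needs no files or sockets:

```javascript
const crypto = require('node:crypto');
const zlib = require('node:zlib');

// crypto: digest() with no argument returns a Buffer; with an encoding, a string.
const digestBuf = crypto.createHash('sha256').update('hello').digest();
const digestHex = crypto.createHash('sha256').update('hello').digest('hex');
console.log(Buffer.isBuffer(digestBuf), digestBuf.length); // true 32
console.log(typeof digestHex, digestHex.length);           // string 64

// zlib: compressed output is a Buffer; decode to text only after decompressing.
const gz = zlib.gzipSync(Buffer.from('hello hello hello'));
const roundTripped = zlib.gunzipSync(gz).toString('utf8');
console.log(Buffer.isBuffer(gz));                          // true
console.log(roundTripped);                                 // 'hello hello hello'

// JSON.parse coerces a Buffer to a string (decoded as UTF-8) before parsing.
const parsed = JSON.parse(Buffer.from('{"ok":true}'));
console.log(parsed.ok);                                    // true
```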


Common Mistakes

1. Using Buffer.allocUnsafe and not fully overwriting it. Any byte you fail to write contains arbitrary memory from previous allocations. Sending such a buffer over the wire is a data-leak vulnerability. Default to Buffer.alloc. Only switch to allocUnsafe when profiling proves it matters and you can prove every byte is overwritten.

2. Calling new Buffer(50). The legacy constructor has been deprecated since Node 6. Depending on the argument type it behaves like alloc, allocUnsafe, or from, which is exactly the kind of polymorphism that led to security vulnerabilities. Use the explicit static methods.

3. Concatenating Buffers with +. bufA + bufB calls toString() on both Buffers, joins them as text using the default utf8 decoding, and returns a string — corrupting any non-text bytes. Use Buffer.concat([bufA, bufB]) instead.

4. Slicing a multi-byte UTF-8 character in half. A single emoji can take 4 bytes in UTF-8. If you split a buffer at byte 3 of a 4-byte character and then call toString('utf8'), you'll get the replacement character ?. For streaming text, use string_decoder.StringDecoder which buffers partial code points across chunks.

5. Using === or == to compare Buffers. Buffers are objects, so === checks reference identity. Use bufA.equals(bufB) for byte-equality, and crypto.timingSafeEqual(bufA, bufB) when comparing secrets.
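Mistakes 3, 4, and 5 are easy to reproduce. The sketch below uses made-up byte values and a single emoji to show each failure next to its fix.

```javascript
const { StringDecoder } = require('node:string_decoder');

// Mistake 3: `+` turns binary data into a (corrupted) string.
const a = Buffer.from([0xff, 0xd8]);
const b = Buffer.from([0xff, 0xe0]);
const viaPlus = a + b;
console.log(typeof viaPlus);                         // 'string'
console.log(Buffer.concat([a, b]).toString('hex'));  // 'ffd8ffe0'

// Mistake 4: a 4-byte emoji split across two chunks.
const emoji = Buffer.from('\u{1F600}', 'utf8');      // f0 9f 98 80
const chunk1 = emoji.subarray(0, 3);
const chunk2 = emoji.subarray(3);

// Decoding each chunk separately produces replacement characters.
const naive = chunk1.toString('utf8') + chunk2.toString('utf8');
console.log(naive.includes('\ufffd'));               // true

// StringDecoder holds back the incomplete bytes until the rest arrives.
const decoder = new StringDecoder('utf8');
const decoded = decoder.write(chunk1) + decoder.write(chunk2) + decoder.end();
console.log(decoded === '\u{1F600}');                // true

// Mistake 5: === compares references, equals compares bytes.
const x = Buffer.from('abc');
const y = Buffer.from('abc');
console.log(x === y, x.equals(y));                   // false true
```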


Interview Questions

1. "What is a Buffer in Node.js, and why does it exist outside the V8 heap?"

A Buffer is a fixed-size chunk of raw memory that holds a sequence of bytes (8-bit unsigned integers). Technically it's a subclass of Uint8Array with Node-specific helpers for encodings, pooling, and integer reads/writes. Its bytes live outside the V8 heap for two reasons. First, V8's heap is bounded (historically about 1.5 GB by default on 64-bit systems; the exact limit depends on the Node version and available memory) and managed by a garbage collector; if Buffers lived on it, every large file read would compete with application objects and add GC pressure. Second, the operating system's I/O syscalls work on raw memory addresses, and keeping the bytes in a stable, non-moving region outside V8 lets Node hand pointers directly to the kernel without copying. The wrapper Buffer object on the V8 heap is tiny (it just references the backing store along with an offset and a length), while the actual byte storage lives in natively allocated memory that is released once the wrapper is garbage-collected.

2. "What is the difference between Buffer.alloc and Buffer.allocUnsafe? When would you use each?"

Buffer.alloc(size) allocates the requested number of bytes and zero-fills every byte before returning. It's safe: you cannot accidentally read leftover data. Buffer.allocUnsafe(size) allocates the bytes but skips the zero-fill, so the returned buffer contains whatever garbage was previously in that region of memory. That garbage could include old request bodies, session tokens, or other sensitive data from earlier in the process's lifetime. You should default to Buffer.alloc everywhere. The only time allocUnsafe is appropriate is in performance-critical hot paths where you can guarantee that every byte will be overwritten before the buffer is exposed, for example by immediately filling it from an fs.read call where you know the read will return exactly size bytes. Leaking stale memory to a client is a memory-disclosure bug in the same broad family as Heartbleed, although Heartbleed itself was caused by a buffer over-read rather than by uninitialized memory.

3. "What's the relationship between Buffer, Uint8Array, and ArrayBuffer?"

ArrayBuffer is the lowest layer — it's just a raw block of bytes with no methods for reading or writing. To access an ArrayBuffer's contents you create a typed view onto it, like Uint8Array, Int16Array, Float32Array, or DataView. Uint8Array is one such view that interprets the bytes as 8-bit unsigned integers. Buffer is a Node.js subclass of Uint8Array that adds helper methods like toString('utf8'), write, readInt32BE, and concat. This means every Buffer is also a Uint8Array, so you can pass a Buffer to any Web Platform API that expects a Uint8Array. The underlying ArrayBuffer is accessible via buf.buffer, but for small Buffers it's often shared with other Buffers via Node's internal pool, so you must always carry byteOffset and byteLength with it.

4. "How would you safely concatenate three Buffers, and what's wrong with using the + operator?"

You use Buffer.concat([buf1, buf2, buf3]), which optionally takes a total length as the second argument for a small performance boost. Using + is wrong because Buffers don't override the addition operator — JavaScript falls back to coercing both operands to strings via toString(), which decodes them as UTF-8 by default. Any non-text bytes (binary protocol headers, compressed data, image bytes) get destroyed by the UTF-8 decoder, replaced with the U+FFFD replacement character, and the result is a corrupted string instead of a binary buffer. Buffer.concat operates byte-by-byte with no encoding interpretation, preserving every bit.

5. "What encodings does Buffer support, and when would you use base64 vs hex vs utf8?"

Buffer supports utf8 (the default), utf16le, latin1, ascii, base64, base64url, and hex. Use utf8 for human-readable text; it's the universal encoding of the modern web. Use base64 when you need to embed binary data in a text-only transport like JSON or email; it expands the size by about 33%. Use base64url when the result will appear in a URL or filename; it replaces + and / with - and _ (and omits the = padding) so no escaping is needed. Use hex for debugging, cryptographic digests, and protocol dumps where readability matters more than space; it doubles the size. Use latin1 only when you specifically need a 1-to-1 byte-to-character round trip without any decoding. Avoid ascii: when decoding, it strips the high bit of every byte, silently corrupting any non-ASCII data.


Quick Reference — Buffer Cheat Sheet

+---------------------------------------------------------------+
|                  BUFFER CHEAT SHEET                           |
+---------------------------------------------------------------+
|                                                                |
|  CREATE:                                                       |
|  Buffer.alloc(n)            // safe, zero-filled               |
|  Buffer.alloc(n, 0xff)      // safe, fill byte                 |
|  Buffer.allocUnsafe(n)      // fast, uninitialized memory!     |
|  Buffer.from('hello')       // from utf8 string                |
|  Buffer.from(b64, 'base64') // from base64 string              |
|  Buffer.from([0x48, 0x69])  // from array of bytes             |
|  Buffer.from(otherBuf)      // copy of another Buffer          |
|                                                                |
|  CONVERT:                                                      |
|  buf.toString('utf8')       // bytes -> text                   |
|  buf.toString('hex')        // bytes -> hex string             |
|  buf.toString('base64')     // bytes -> base64                 |
|                                                                |
|  COMBINE:                                                      |
|  Buffer.concat([a, b, c])           // join                    |
|  Buffer.concat([a, b], totalLen)    // faster with length      |
|  buf.subarray(start, end)           // VIEW (shared memory)    |
|  Buffer.from(buf.subarray(0, 5))    // independent copy        |
|                                                                |
|  COMPARE:                                                      |
|  bufA.equals(bufB)                  // byte equality           |
|  bufA.compare(bufB)                 // -1 / 0 / 1              |
|  crypto.timingSafeEqual(a, b)       // for secrets             |
|                                                                |
|  INSPECT:                                                      |
|  buf.length                 // bytes                           |
|  buf.byteOffset             // offset within shared pool       |
|  buf.buffer                 // underlying ArrayBuffer          |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|                  KEY RULES                                     |
+---------------------------------------------------------------+
|                                                                |
|  1. Default to Buffer.alloc -- safe is the right default.      |
|  2. allocUnsafe only if you fully overwrite every byte.        |
|  3. NEVER use new Buffer() -- deprecated and unsafe.            |
|  4. Use Buffer.concat, NEVER + to join Buffers.                |
|  5. subarray returns a VIEW; wrap in Buffer.from for a copy.   |
|  6. utf8 is the default and almost always the right choice.    |
|  7. Use crypto.timingSafeEqual for comparing secrets.          |
|  8. A Buffer is-a Uint8Array -- pass it anywhere one is wanted.|
|                                                                |
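+---------------------------------------------------------------+

+-----------------------+-------------------+---------+-----------------------------+
| Method                | Safe?             | Speed   | Use When                    |
+-----------------------+-------------------+---------+-----------------------------+
| Buffer.alloc(n)       | Yes (zero-filled) | Slower  | Default choice              |
| Buffer.allocUnsafe(n) | No (raw memory)   | Fastest | Hot path, fully overwriting |
| Buffer.from(string)   | Yes               | Fast    | Converting existing data    |
| new Buffer(n)         | No (deprecated)   | --      | Never                       |
+-----------------------+-------------------+---------+-----------------------------+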



This is Lesson 4.1 of the Node.js Interview Prep Course -- 10 chapters, 42 lessons.
