StringTools

URL Parser

Break down any URL into its individual components — protocol, host, port, path, query parameters, and more.

Mitul Mandanka, Founder, Progragon Technolabs · 15+ years building software
Updated June 2026 · 7 min read

TL;DR

Paste a URL into the parser and it splits into seven parts: scheme, userinfo, host, port, path, query, and fragment. RFC 3986 names five top-level components and groups userinfo, host, and port under the authority. Parsing runs entirely in your browser using the native URL constructor, the same parser that backs location in every browser, so what you see is how a browser would read your input. The query string is decoded with URLSearchParams, so repeated keys like ?id=1&id=2 are kept as separate values rather than silently collapsed.

The anatomy of a URL, component by component

Every absolute URL is built from the same skeleton. The table below dissects one fully-loaded example so you can see exactly where each delimiter ends and the next component begins. The sample URL:

https://api:s3cr3t@shop.example.com:8443/v2/orders?status=open&status=paid#summary
| Component | Delimiter | Value in the example | Sent to server? |
|---|---|---|---|
| Scheme | ends with : | https | n/a |
| Userinfo | between // and @ | api:s3cr3t | Yes (as auth header) |
| Host | after @ | shop.example.com | Yes (Host header) |
| Port | after : | 8443 | Connection target |
| Path | starts with / | /v2/orders | Yes |
| Query | starts with ? | status=open&status=paid | Yes |
| Fragment | starts with # | summary | No, client only |

The fragment never leaves the browser. It is not in the HTTP request line, so server logs, analytics backends, and CDNs never see it — which is why single-page apps historically used #/route for client-side routing.
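
As a rough illustration of the table above, the sketch below feeds the sample URL to the browser's URL constructor and reads each component back. The variable and key names (sample, parts, statuses) are illustrative only; the properties are the standard ones on URL. Note that the API keeps some delimiters in the values it returns.

```javascript
// Sample URL from the table above.
const sample = new URL(
  "https://api:s3cr3t@shop.example.com:8443/v2/orders?status=open&status=paid#summary"
);

// Collect the components under the names used in the table.
const parts = {
  scheme: sample.protocol,    // "https:" (trailing colon included)
  username: sample.username,  // "api"
  password: sample.password,  // "s3cr3t"
  host: sample.hostname,      // "shop.example.com"
  port: sample.port,          // "8443"
  path: sample.pathname,      // "/v2/orders"
  query: sample.search,       // "?status=open&status=paid" (leading ? included)
  fragment: sample.hash,      // "#summary" (leading # included)
};

// Repeated keys stay separate when read with getAll.
const statuses = sample.searchParams.getAll("status"); // ["open", "paid"]

console.log(parts, statuses);
```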

Percent-encoding: when a character must be escaped

A URL may only contain a restricted set of ASCII characters. Anything outside that set (spaces, some punctuation, non-ASCII letters) must be percent-encoded: each byte is replaced by a % followed by its two-digit hex value, and non-ASCII characters are first converted to their UTF-8 bytes. The catch is that reserved characters have a structural meaning, so whether they need encoding depends on where they appear. The table below lists the characters that most often cause trouble, with their codes; space and % are not reserved characters, but both must always be escaped.

| Char | Encoded | Structural role | Encode inside a value when… |
|---|---|---|---|
| space | %20 (or + in query) | Not allowed unencoded | Always |
| ! | %21 | Sub-delimiter | Rarely needed |
| # | %23 | Starts the fragment | Always, else the value is truncated |
| $ | %24 | Sub-delimiter | In a query value |
| & | %26 | Separates query pairs | Always in a query value |
| + | %2B | Decodes to space in query | Always in a query value |
| , | %2C | Sub-delimiter | In a path segment value |
| / | %2F | Separates path segments | When literal inside one segment |
| : | %3A | Splits host/port, scheme | In a value, to be safe |
| ; | %3B | Sub-delimiter | In a value |
| = | %3D | Splits key from value | Always in a query value |
| ? | %3F | Starts the query string | In a path; safe inside query |
| @ | %40 | Ends userinfo | In a value (e.g. an email) |
| % | %25 | Escape introducer | Always, else it's read as an escape |

The unreserved set (A–Z a–z 0–9 - _ . ~) never needs encoding. Encoding it anyway is legal but pointless, and although RFC 3986 treats %2D and - as equivalent, some servers compare paths byte-for-byte and can treat them as different.
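
To make the byte-level rule concrete, here is a minimal sketch of a percent-encoder that escapes every UTF-8 byte outside the unreserved set. The helper name percentEncode is hypothetical, and it is deliberately stricter than the built-in encodeURIComponent, which leaves ! ' ( ) * unescaped. In real code the built-in is usually the better choice.

```javascript
// Characters RFC 3986 calls "unreserved": they never need escaping.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: percent-encode every byte outside the unreserved set.
function percentEncode(text) {
  let out = "";
  // TextEncoder always produces UTF-8, so multi-byte characters yield several bytes.
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    out += UNRESERVED.test(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(percentEncode("a b"));     // a%20b
console.log(percentEncode("café"));    // caf%C3%A9 (é is two UTF-8 bytes)
console.log(percentEncode("100%"));    // 100%25
console.log(percentEncode("x-y_z.~")); // x-y_z.~ (unreserved, left alone)
```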

encodeURI vs encodeURIComponent: the one that bites everyone

JavaScript ships two encoders, and picking the wrong one is the single most common URL bug. The rule: use encodeURI on a whole URL, and encodeURIComponent on a single piece you are about to drop into one.

  • encodeURI assumes its input is already a complete URL, so it leaves the structural characters : / ? # & = alone. It only escapes spaces and other illegal characters.
  • encodeURIComponent assumes its input is just data, so it escapes : / ? # & = too — exactly what you want when that data could itself contain a slash, ampersand, or equals sign.

```javascript
const term = "rock & roll/50%";

// WRONG: the raw & ends the parameter early, so the server sees q="rock "
console.log("/search?q=" + encodeURI(term));
//   ->  /search?q=rock%20&%20roll/50%25

// RIGHT: the whole value is escaped as one opaque chunk
console.log("/search?q=" + encodeURIComponent(term));
//   ->  /search?q=rock%20%26%20roll%2F50%25
```

In modern code you usually skip both: build the query with URLSearchParams and it encodes each value for you. Note one historical quirk: URLSearchParams encodes a space as +, while encodeURIComponent uses %20. A query parser such as URLSearchParams decodes both back to a space, but decodeURIComponent leaves + as a literal plus sign.
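
A short sketch of that approach, reusing the search term from the earlier example. The variable names are illustrative; the calls are the standard URLSearchParams and decodeURIComponent APIs.

```javascript
const term = "rock & roll/50%";

// URLSearchParams escapes each value for you; spaces become "+".
const params = new URLSearchParams({ q: term });
const query = params.toString(); // "q=rock+%26+roll%2F50%25"

// Parsing the query string gives the original value back.
const roundTrip = new URLSearchParams(query).get("q"); // "rock & roll/50%"

// The "+" convention only applies to query parsing:
const viaParams = new URLSearchParams("q=rock+roll").get("q"); // "rock roll"
const viaDecode = decodeURIComponent("rock+roll");             // "rock+roll"

console.log(query, roundTrip, viaParams, viaDecode);
```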

How repeated query keys are handled (and why it varies)

The URL spec does not define what ?tag=red&tag=blue means; that is left to the server. Different stacks resolve the same string differently, which is a frequent source of "it works in Postman but not in prod" bugs:

| Stack / API | Reads ?tag=red&tag=blue as | Result |
|---|---|---|
| JS URLSearchParams.get | first value only | red |
| JS URLSearchParams.getAll | array of all values | [red, blue] |
| PHP $_GET | last value wins | blue |
| Express (qs) req.query | array | [red, blue] |
| Python Flask request.args.get | first value only | red |
| Rails (Rack) | last value wins (no []) | blue |

This parser shows every occurrence in order, so you can see the raw truth before your framework collapses it. When you genuinely need a list, prefer an explicit convention your backend documents — PHP and Rails expect tag[]=red&tag[]=blue.
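
One way to keep every occurrence in order is to group values per key before any framework gets involved. The helper groupQuery below is a hypothetical sketch, not this parser's actual source; the second half shows how the bracket convention looks once URLSearchParams has serialized it.

```javascript
// Hypothetical helper: collect every value for each key, in order of appearance.
function groupQuery(query) {
  const grouped = new Map();
  // URLSearchParams ignores a single leading "?" and yields [key, value] pairs.
  for (const [key, value] of new URLSearchParams(query)) {
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(value);
  }
  return grouped;
}

const grouped = groupQuery("?tag=red&tag=blue&page=2");
console.log(grouped.get("tag"));  // [ 'red', 'blue' ]
console.log(grouped.get("page")); // [ '2' ]

// Building the bracket convention that PHP and Rails read as a list.
const list = new URLSearchParams();
list.append("tag[]", "red");
list.append("tag[]", "blue");
console.log(list.toString()); // tag%5B%5D=red&tag%5B%5D=blue
```

The brackets are percent-encoded on the wire; servers that support the convention decode them before matching the key.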

Questions developers actually search for

What is the difference between a URL's host and hostname?

host includes the port when one is present (shop.example.com:8443), while hostname is just the domain (shop.example.com). If the URL uses the default port for its scheme — 443 for HTTPS, 80 for HTTP — the browser omits the port, so host and hostname come out identical.
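
A quick sketch of the difference, using the example host from earlier on this page with an explicit non-default port and then with the HTTPS default. The variable names are illustrative.

```javascript
const custom = new URL("https://shop.example.com:8443/v2/orders");
const implicit = new URL("https://shop.example.com:443/v2/orders");

console.log(custom.host);       // shop.example.com:8443
console.log(custom.hostname);   // shop.example.com
console.log(implicit.host);     // shop.example.com (default port dropped)
console.log(implicit.port);     // empty string
```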

Why does my query parameter value get cut off at a special character?

Because you put a raw reserved character into the value. An unencoded & starts a new parameter and a # starts the fragment, so ?q=a&b is read as two params and ?q=a#b drops the #b from the query entirely. Run each value through encodeURIComponent (or build it with URLSearchParams) before assembling the URL.
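
The sketch below reproduces the truncation and then the fix. The example.com address and the variable names are placeholders chosen for illustration.

```javascript
// Raw reserved characters in the value: & starts a new pair, # starts the fragment.
const broken = new URL("https://example.com/search?q=a&b#c");
console.log(broken.searchParams.get("q")); // a
console.log(broken.searchParams.has("b")); // true (b became its own key)
console.log(broken.hash);                  // #c

// Letting URLSearchParams encode the value keeps it in one piece.
const fixed = new URL("https://example.com/search");
fixed.searchParams.set("q", "a&b#c");
console.log(fixed.href);                   // https://example.com/search?q=a%26b%23c
console.log(fixed.searchParams.get("q"));  // a&b#c
```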

Is the part after the # sent to the server?

No. The fragment (everything after #) is stripped by the browser before the request is made — it never appears in the HTTP request line, server logs, or referrer headers. That is why OAuth implicit flows used to return tokens in the fragment, and why #access_token=... stays off the server. It is purely a client-side anchor or route.
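
To see what the server receives, it can help to rebuild the request target by hand. The helper requestTarget is a hypothetical sketch of the path-plus-query form used on the HTTP request line; real clients build this internally.

```javascript
// Hypothetical helper: the part of a URL that goes on the HTTP request line.
function requestTarget(href) {
  const url = new URL(href);
  // The fragment (url.hash) is deliberately left out: it stays on the client.
  return url.pathname + url.search;
}

const target = requestTarget(
  "https://shop.example.com/v2/orders?status=open#summary"
);
console.log(target); // /v2/orders?status=open
```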

When should I use encodeURI instead of encodeURIComponent?

Use encodeURI only when you have an entire, already-structured URL and just want to make illegal characters like spaces safe — it deliberately leaves : / ? # & = intact. Use encodeURIComponent for any single piece (a query value, a path segment) that might itself contain those characters. When in doubt for a single value, reach for encodeURIComponent.

How do I read a query parameter that appears more than once?

In JavaScript, params.get("tag") returns only the first value, while params.getAll("tag") returns every value as an array. Servers disagree: PHP keeps the last value, Flask keeps the first, and Express returns an array. Check the table above and confirm your specific backend's behavior rather than assuming.

Does this parser send my URL anywhere?

No. The site is a static export with no backend, and parsing uses the browser's built-in URL and URLSearchParams objects. Your URL — including any tokens or credentials embedded in it — never leaves your machine. You can confirm this in DevTools → Network: parsing triggers zero outbound requests.