Web context

The @vertana/context-web package provides context sources that fetch and extract content from web pages. This is useful when translating documents that reference external articles or resources.

Installation

deno add jsr:@vertana/context-web

npm add @vertana/context-web

pnpm add @vertana/context-web

yarn add @vertana/context-web

bun add @vertana/context-web

Overview

This package provides three main context sources:

fetchWebPage: A passive context source that fetches a single URL on demand. The LLM can call this tool when it needs additional context.
fetchLinkedPages: A required context source factory that extracts all links from the source text and fetches their content before translation begins.
searchWeb: A passive context source that performs a web search and returns a list of results (title, URL, snippet).

fetchWebPage and fetchLinkedPages use Mozilla's Readability algorithm to extract the main article content from web pages, filtering out navigation, ads, and other noise.

`fetchWebPage`

A passive context source that the LLM can invoke when it needs to fetch a specific URL.

import { translate
 } from "@vertana/facade";
import { fetchWebPage
 } from "@vertana/context-web";

const text
 = `
This article discusses the concept explained at https://example.com/guide.
`;

const result
 = await translate
(model
, "ko", text
, {
  contextSources
: [fetchWebPage
],
});

When the LLM encounters a reference it wants to understand better, it can call the fetch-web-page tool with the URL to retrieve the page content.

`fetchLinkedPages`

A factory function that creates a required context source. It extracts all URLs from the source text and fetches their content before translation begins.

import { translate
 } from "@vertana/facade";
import { fetchLinkedPages
 } from "@vertana/context-web";

const text
 = `
Check out https://example.com/article for background.
Also see https://example.com/reference for more details.
`;

const result
 = await translate
(model
, "ko", text
, {
  contextSources
: [
    fetchLinkedPages
({
      text
,
      mediaType
: "text/plain",
    }),
  ],
});

Options

text: The source text to extract links from.
mediaType: The media type of the text ("text/plain", "text/markdown", or "text/html"). This affects how links are extracted.
maxLinks: Maximum number of links to fetch. Defaults to 10.
timeout: Timeout for each fetch request in milliseconds. Defaults to 10000.

`searchWeb`

A passive context source that the LLM can invoke when it needs to perform web search. This source only returns a list of results and does not fetch any result pages.

import { translate
 } from "@vertana/facade";
import { searchWeb
 } from "@vertana/context-web";

const text
 = "Please translate this and cite sources.";

const result
 = await translate
(model
, "ko", text
, {
  contextSources
: [searchWeb
],
});

When the LLM needs to find a relevant resource, it can call the search-web tool with a query to obtain a list of results (title, URL, snippet).

Options

query: The search query keyword(s).
maxResults: Maximum number of results to return. Defaults to 10.
region: DuckDuckGo region parameter (kl), e.g. "kr-kr" or "us-en".
timeRange: Time range filter (df): "d" (day), "w" (week), "m" (month), "y" (year).

Combining sources

For best results, use these sources together:

fetchLinkedPages provides context from links already present in the input.
searchWeb helps the LLM find relevant pages when the input has no links.
fetchWebPage lets the LLM fetch a specific result URL for more detail.

import { translate
 } from "@vertana/facade";
import { fetchLinkedPages
, fetchWebPage
, searchWeb
 } from "@vertana/context-web";

const text
 = `
Read the introduction at https://example.com/intro.
`;

const result
 = await translate
(model
, "ko", text
, {
  contextSources
: [
    // Pre-fetch all links in the text
    fetchLinkedPages
({ text
, mediaType
: "text/plain" }),
    // Allow LLM to search the web and fetch additional URLs on demand
    searchWeb
,
    fetchWebPage
,
  ],
});

extractLinks utility

The extractLinks function extracts URLs from text. It's used internally by fetchLinkedPages but is also exported for custom use cases.

import { extractLinks
 } from "@vertana/context-web";

// From plain text
const plainUrls
 = extractLinks
(
  "Check https://example.com for info.",
  "text/plain"
);
// => ["https://example.com"]

// From Markdown
const mdUrls
 = extractLinks
(
  "See [this article](https://example.com/article).",
  "text/markdown"
);
// => ["https://example.com/article"]

// From HTML
const htmlUrls
 = extractLinks
(
  '<a href="https://example.com">Link</a>',
  "text/html"
);
// => ["https://example.com"]

CLI usage

The Vertana CLI includes the -L or --fetch-links flag that enables web context fetching:

vertana translate -t ko -L document.md

This automatically:

Extracts all links from the input document
Fetches and extracts content from linked pages
Provides the content as context for translation

See the CLI reference for more details.

Web context ​

Installation ​

Overview ​

fetchWebPage ​

fetchLinkedPages ​

Options ​

searchWeb ​

Options ​

Combining sources ​

extractLinks utility ​

CLI usage ​

Web context

Installation

Overview

`fetchWebPage`

`fetchLinkedPages`

Options

`searchWeb`

Options

Combining sources

extractLinks utility

CLI usage