How LSP Autocomplete Sorting Works: A Simple Fix for Better Suggestions

I forked a Python type checker just to fix autocomplete sorting. That’s how frustrated I was.

When I typed self. in VSCode, I expected self.__init__ or self.name to appear at the top. Instead, I got:

self.__class__ <- Rarely used
self.__delattr__ <- Never needed
self.__dict__ <- Sometimes useful
self.__dir__ <- What is this?
...
self.__init__ <- Finally! (after scrolling)

The Problem

The Language Server Protocol (LSP) defines how editors communicate with language servers for features like autocomplete. But the protocol leaves sorting strategy almost entirely to the implementation.

Here’s what the LSP specification says about completion items:

lsp-completion-item.ts
interface CompletionItem {
  label: string;                            // The label shown to the user
  kind?: CompletionItemKind;                // Method, function, variable, etc.
  detail?: string;                          // Additional details
  documentation?: string | MarkupContent;   // Full docs
  sortText?: string;                        // Sort order override
  filterText?: string;                      // Filter override
  insertText?: string;                      // What gets inserted
  // ... more optional fields
}

Notice sortText? It’s optional. And many language servers don’t set it meaningfully.

Why Sorting Feels Random

I dug into how LSP handles completion sorting and found three main issues:

1. Alphabetical by default

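When sortText is omitted, the LSP specification says the label is used as the sort text, so completions end up in plain alphabetical order. Underscores sort before lowercase letters, which is why __class__ appears before __init__ and name.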
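A quick way to see this for yourself is to sort a handful of attribute names the way a label-based fallback would. The list below is an illustrative sample, not the actual output of any server, and real clients may use a different comparison (case-insensitive, for example):

```python
# Attribute names a server might return for `self.` (illustrative sample).
labels = ["name", "__init__", "value", "__class__", "__dict__", "__dir__", "__delattr__"]

# Plain string sort: "_" (0x5F) sorts before lowercase letters (0x61 and up),
# so every dunder name lands ahead of the ordinary attributes.
by_label = sorted(labels)
print(by_label)
# ['__class__', '__delattr__', '__dict__', '__dir__', '__init__', 'name', 'value']
```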

2. No usage tracking

LSP doesn’t track which completions you actually select. The protocol has no mechanism for learning from user behavior.

3. Static heuristics

Some servers use static heuristics based on item type (methods before variables, for example), but these don’t account for your specific usage patterns.
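Here's a minimal sketch of what a kind-based heuristic could look like. The numeric kinds are the CompletionItemKind values from the LSP specification, but the bucket priorities and the kind_sort_text helper are my own assumptions for illustration, not taken from any real server:

```python
# CompletionItemKind values from the LSP specification.
METHOD, FUNCTION, FIELD, VARIABLE, PROPERTY = 2, 3, 5, 6, 10

# Hypothetical priority per kind: a lower bucket sorts first.
KIND_BUCKET = {METHOD: 0, FUNCTION: 0, PROPERTY: 1, FIELD: 1, VARIABLE: 2}

def kind_sort_text(label: str, kind: int) -> str:
    """Build a sortText that groups by kind, then orders by label."""
    bucket = KIND_BUCKET.get(kind, 9)  # unknown kinds go last
    return f"{bucket}{label}"

items = [("name", FIELD), ("__class__", PROPERTY), ("save", METHOD), ("count", VARIABLE)]
ranked = sorted(items, key=lambda item: kind_sort_text(*item))
print([label for label, _ in ranked])
# ['save', '__class__', 'name', 'count']
```

Note that __class__ still lands ahead of name inside its bucket: grouping by kind alone says nothing about which attribute you actually reach for.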

┌─────────────────────────────────────────────┐
│             LSP Sorting Options             │
├─────────────────────────────────────────────┤
│  1. sortText field (optional)               │
│  2. Alphabetical (default fallback)         │
│  3. Kind-based ranking (some servers)       │
│  4. Usage frequency (rarely implemented)    │
└─────────────────────────────────────────────┘

A Reddit user captured this frustration:

“It feels quite absurd that basic intelligence like this is lacking from the majority of programs” - 40 points

The Solution: Hash Table Lookup

I created a simple solution: a hash table of commonly used prefixes.

prefix_ranking.py
# Common Python prefixes ranked by actual usage
COMMON_PREFIXES = {
    'self.': {
        '__init__': 100,
        'name': 95,
        'value': 90,
        'data': 85,
        'id': 80,
        # ... more common attributes
    },
    'os.': {
        'path': 100,
        'environ': 95,
        'getcwd': 90,
        'listdir': 85,
        'remove': 80,
        # ... more common functions
    },
    'json.': {
        'loads': 100,
        'dumps': 95,
        'load': 90,
        'dump': 85,
    },
    'np.': {  # NumPy shortcuts
        'array': 100,
        'zeros': 95,
        'ones': 90,
        'mean': 85,
    },
}

def rank_completion(prefix: str, completion: str) -> int:
    """Return ranking score for a completion item (0 if unknown)."""
    if prefix not in COMMON_PREFIXES:
        return 0
    return COMMON_PREFIXES[prefix].get(completion, 0)

def sort_completions(prefix: str, items: list[str]) -> list[str]:
    """Sort completions by usage frequency."""
    # Sort by rank (descending), then alphabetically
    return sorted(items, key=lambda x: (-rank_completion(prefix, x), x))
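To see what ordering this produces, here's a trimmed, self-contained copy of the same idea with just enough of the table to run. The candidates list is a made-up sample:

```python
# Trimmed copy of the ranking table, just enough to run the example.
COMMON_PREFIXES = {"self.": {"__init__": 100, "name": 95, "value": 90}}

def rank_completion(prefix: str, completion: str) -> int:
    return COMMON_PREFIXES.get(prefix, {}).get(completion, 0)

def sort_completions(prefix: str, items: list[str]) -> list[str]:
    # Highest rank first; ties (including unranked items) fall back to alphabetical.
    return sorted(items, key=lambda x: (-rank_completion(prefix, x), x))

candidates = ["__class__", "__dict__", "__init__", "name", "value"]

ranked = sort_completions("self.", candidates)
print(ranked)   # ['__init__', 'name', 'value', '__class__', '__dict__']

unknown = sort_completions("foo.", candidates)
print(unknown)  # ['__class__', '__dict__', '__init__', 'name', 'value']
```

For a prefix that isn't in the table, every item scores 0 and the result is the same alphabetical order as before, so the table only helps for the prefixes you've bothered to list.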

This approach has clear advantages:

┌─────────────────────────────────────────────┐
│       Hash Table vs AI-based Sorting        │
├─────────────────────────────────────────────┤
│  Hash Table:                                │
│    - O(1) lookup time                       │
│    - Predictable behavior                   │
│    - Easy to debug and modify               │
│    - Negligible CPU overhead                │
│    - Handles known patterns well            │
├─────────────────────────────────────────────┤
│  AI-based:                                  │
│    - Slower inference                       │
│    - CPU-intensive                          │
│    - Black box behavior                     │
│    - Better for unknown patterns            │
│    - Higher maintenance cost                │
└─────────────────────────────────────────────┘

How LSP Completion Actually Works

Let me show you how completion items flow between the editor and the language server:

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    VSCode    │────▶│  LSP Client  │────▶│   Language   │
│   (Editor)   │     │   (Bridge)   │     │    Server    │
└──────────────┘     └──────────────┘     └──────────────┘
        │                    │                    │
        │  textDocument/     │                    │
        │  completion        │                    │
        │ ───────────────────────────────────────▶
        │                    │                    │
        │                    │  CompletionItem[]  │
        │ ◀───────────────────────────────────────
        │                    │                    │
        │  Editor applies    │                    │
        │  client-side       │                    │
        │  sorting/filtering │                    │
        │                    │                    │

The client (VSCode) sends a textDocument/completion request to the language server. The server returns a list of CompletionItem objects (or a CompletionList wrapping them). The client then filters and sorts that list itself before showing it.
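Here's a rough sketch of that exchange as plain Python dictionaries. The method name and field names come from the LSP specification; the file URI, position, and returned items are made-up values, and client_sort is a simplified stand-in for what an editor does (real clients also weigh how well each item matches what you've typed):

```python
import json

# Request the client sends when completion is triggered after `self.`
# (URI and position are made-up values for illustration).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/completion",
    "params": {
        "textDocument": {"uri": "file:///project/app.py"},
        "position": {"line": 10, "character": 13},
    },
}

# A possible response: a list of CompletionItem objects.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": [
        {"label": "__class__", "kind": 10},
        {"label": "name", "kind": 5, "sortText": "002"},
        {"label": "__init__", "kind": 2, "sortText": "001"},
    ],
}

def client_sort(items):
    """Order items by sortText, using label when sortText is absent."""
    return sorted(items, key=lambda item: item.get("sortText", item["label"]))

print(json.dumps(request, indent=2))
print([item["label"] for item in client_sort(response["result"])])
# ['__init__', 'name', '__class__']
```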

Implementing Better Sorting in a Language Server

Here’s how I modified a Python language server to use hash table ranking:

lsp_server.py
from dataclasses import dataclass
from typing import List

@dataclass
class CompletionItem:
    label: str
    kind: int
    sort_text: str = ""
    detail: str = ""

# Pre-computed prefix rankings
PREFIX_RANKS = {
    ('self', '__init__'): '001',
    ('self', 'name'): '002',
    ('os', 'path'): '001',
    ('json', 'loads'): '001',
}

def get_sort_text(prefix: str, label: str) -> str:
    """Generate sortText based on prefix and label."""
    key = (prefix, label)
    if key in PREFIX_RANKS:
        return PREFIX_RANKS[key]
    # Fall back to a low-priority bucket, ordered by label
    return f'999{label}'

def build_completions(prefix: str, items: List[str], kinds: List[int]) -> List[CompletionItem]:
    """Build completion items with proper sortText."""
    completions = []
    for label, kind in zip(items, kinds):
        completions.append(CompletionItem(
            label=label,
            kind=kind,
            sort_text=get_sort_text(prefix, label)
        ))
    return completions

The key insight: the sortText field determines display order. It is compared as a string, so lower values appear first, and that is why the ranks are zero-padded ('001' rather than '1').
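Because the comparison is on strings rather than numbers, padding matters. A small sketch, where rank_to_sort_text is a hypothetical helper and not part of any LSP library:

```python
def rank_to_sort_text(rank: int, width: int = 3) -> str:
    """Zero-pad a numeric rank so string order matches numeric order."""
    return str(rank).zfill(width)

# Unpadded ranks sort wrongly as strings: "10" comes before "2" and "9".
unpadded = sorted(["9", "10", "2"])
print(unpadded)  # ['10', '2', '9']

# Padded ranks sort the way you'd expect.
padded = sorted(rank_to_sort_text(n) for n in (9, 10, 2))
print(padded)    # ['002', '009', '010']
```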

VSCode Client-Side Configuration

You can also improve sorting on the VSCode side:

.vscode/settings.json
{
  "editor.suggestSelection": "recentlyUsedByPrefix",
  "editor.snippetSuggestions": "top",
  "editor.suggest.showMethods": true,
  "editor.suggest.showFunctions": true
}

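The recentlyUsedByPrefix option tells VSCode to remember which completion you picked for each typed prefix and to preselect it the next time you type that prefix. It changes which item is highlighted, not the order of the list.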
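Conceptually, a per-prefix memory can be as small as a dictionary. The sketch below is my own illustration of the idea and not VSCode's actual implementation; PrefixMemory and its methods are hypothetical names, and it assumes the label list is never empty:

```python
class PrefixMemory:
    """Remember the last completion chosen for each typed prefix."""

    def __init__(self) -> None:
        self._last_choice: dict[str, str] = {}

    def record(self, typed: str, chosen: str) -> None:
        """Store the item the user accepted after typing `typed`."""
        self._last_choice[typed] = chosen

    def preselect(self, typed: str, labels: list[str]) -> str:
        """Return the label to highlight; default to the first item."""
        remembered = self._last_choice.get(typed)
        return remembered if remembered in labels else labels[0]

memory = PrefixMemory()
memory.record("co", "console")
memory.record("con", "const")

labels = ["concat", "console", "const"]
print(memory.preselect("co", labels))   # console
print(memory.preselect("con", labels))  # const
print(memory.preselect("c", labels))    # concat (nothing remembered yet)
```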

Why Hasn’t This Been Fixed?

I wondered why such an obvious problem persists. The answer is architectural:

1. LSP protocol limitations

The protocol doesn’t include usage frequency or learning capabilities. Each client (VSCode, Neovim, Emacs) implements its own ranking.

2. Server diversity

There are several Python language servers (Pylance, Pyright, jedi-language-server, pylsp). Fixing sorting in one doesn’t fix it everywhere.

3. Resource constraints

Pylance can already use 4GB+ of RAM on large workspaces. Adding ML-based ranking would make it worse.

4. Specification inertia

LSP is an open standard. Adding new features requires coordination between Microsoft, language server authors, and editor developers.

The Ideal Solution

A combined approach would work best:

┌────────────────────────────────────────────────────────┐
│               Proposed LSP Sorting Stack               │
├────────────────────────────────────────────────────────┤
│  Layer 1: Hash table (fast, predictable)               │
│    - Common prefixes (self., os., json.)               │
│    - O(1) lookup                                       │
│                                                        │
│  Layer 2: Context-aware heuristics                     │
│    - Type matching                                     │
│    - Scope relevance                                   │
│    - Recent edits in file                              │
│                                                        │
│  Layer 3: Usage learning (client-side)                 │
│    - Track selection frequency                         │
│    - Personalize over time                             │
│    - No server overhead                                │
└────────────────────────────────────────────────────────┘
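Here's a rough sketch of how the three layers could feed a single score. Everything in it is an assumption for illustration: the scores in STATIC_RANK, the dunder penalty standing in for the real context heuristics of layer 2, and the weight of 10 per recorded selection:

```python
from collections import Counter

# Layer 1: static table (hypothetical scores for illustration).
STATIC_RANK = {("self.", "__init__"): 100, ("self.", "name"): 95}

# Layer 3: per-user selection counts, kept on the client side.
usage = Counter()

def heuristic_score(label: str) -> int:
    """Layer 2 stand-in: demote dunder names, which are rarely typed by hand."""
    return -50 if label.startswith("__") and label.endswith("__") else 0

def score(prefix: str, label: str) -> int:
    static = STATIC_RANK.get((prefix, label), 0)
    # A static hit skips the dunder penalty so `__init__` can stay on top.
    context = 0 if static else heuristic_score(label)
    return static + context + 10 * usage[(prefix, label)]

def rank(prefix: str, labels: list[str]) -> list[str]:
    """Highest combined score first, alphabetical on ties."""
    return sorted(labels, key=lambda label: (-score(prefix, label), label))

labels = ["__class__", "__init__", "name", "total"]

before = rank("self.", labels)
print(before)  # ['__init__', 'name', 'total', '__class__']

usage[("self.", "total")] += 12  # the user keeps picking `total`
after = rank("self.", labels)
print(after)   # ['total', '__init__', 'name', '__class__']
```

The static table gives sensible defaults on day one, and the usage counts let your own habits override them over time.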

Language Server Protocol (LSP): A JSON-RPC based protocol that separates language features from editors. Editors become thin clients, while language servers provide intelligence like autocomplete, go-to-definition, and diagnostics.

CompletionItemKind: An enum in LSP that categorizes completions (text, method, function, constructor, field, variable, class, interface, module, property, etc.). Useful for filtering and visual icons.

sortText vs filterText: sortText controls display order. filterText controls what text is matched against user input. They serve different purposes but both affect autocomplete UX.


Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Oh, and if you found this article useful, don’t forget to support me by starring the repo on GitHub!