quartz/quartz/components at ec26ebcc9e53f67f6242266556ed13445e2f9688

smyalygames/quartz

mirror of https://github.com/jackyzha0/quartz.git synced 2025-12-03 11:17:57 +01:00

うろちょろ ec26ebcc9e

feat: improve search tokenization for CJK languages (#2231)

* feat: improve search tokenization for CJK languages

Enhance the encoder function to properly tokenize CJK (Chinese, Japanese,
Korean) characters while maintaining English word tokenization. This fixes
search issues where CJK text was not searchable due to whitespace-only
splitting.

Changes:
- Tokenize CJK characters (Hiragana, Katakana, Kanji, Hangul) individually
- Preserve whitespace-based tokenization for non-CJK text
- Support mixed CJK/English content in search queries

This addresses the CJK search issues reported in #2109 where Japanese text
like "て以来" was not searchable because the encoder only split on whitespace.

Tested with Japanese, Chinese, and Korean content to verify character-level
tokenization works correctly while maintaining English search functionality.
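
A minimal sketch of the behavior described above, assuming a simple regex-based approach. The name `tokenizeSimple` and the exact code point ranges are illustrative assumptions, not the encoder that ships in this commit.

```typescript
// Hypothetical illustration only; not the Quartz encoder.
// Assumed ranges: Hiragana, Katakana, CJK Unified Ideographs, Hangul Syllables.
const CJK_CHAR = /[\u3040-\u309f\u30a0-\u30ff\u4e00-\u9fff\uac00-\ud7af]/

function tokenizeSimple(text: string): string[] {
  const tokens: string[] = []
  // Split on whitespace first, as the previous encoder is described as doing.
  for (const chunk of text.toLowerCase().split(/\s+/)) {
    let word = ""
    for (const ch of chunk) {
      if (CJK_CHAR.test(ch)) {
        // A CJK character ends any pending non-CJK word and becomes its own token.
        if (word) {
          tokens.push(word)
          word = ""
        }
        tokens.push(ch)
      } else {
        word += ch
      }
    }
    if (word) tokens.push(word)
  }
  return tokens
}

console.log(tokenizeSimple("Hello て以来 world"))
// ["hello", "て", "以", "来", "world"]
```

With whitespace-only splitting, "て以来" would stay a single token, so a query for one of its characters would not line up with it; emitting each CJK character separately avoids that.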

* perf: optimize CJK search encoder with manual buffer tracking

Replace regex-based tokenization with index-based buffer management.
This improves performance by ~2.93x according to benchmark results.

- Use explicit buffer start/end indices instead of string concatenation
- Replace split(/\s+/) with direct whitespace code point checks
- Remove redundant filter() operations
- Add CJK Extension B support (U+20000-U+2A6DF)

Performance: ~878ms → ~300ms (100 iterations, mixed CJK/English text)
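
The index-based approach can be sketched roughly as follows. This is an assumption-laden illustration rather than the committed code: `tokenizeBuffered`, `isCJK`, and `isWhitespace` are hypothetical names, and the whitespace set (space, tab, LF, CR) is a guess at what "direct whitespace code point checks" covers.

```typescript
// Hypothetical sketch of index-based buffer tracking; not the Quartz source.
function isCJK(code: number): boolean {
  return (
    (code >= 0x3040 && code <= 0x309f) || // Hiragana
    (code >= 0x30a0 && code <= 0x30ff) || // Katakana
    (code >= 0x4e00 && code <= 0x9fff) || // CJK Unified Ideographs
    (code >= 0xac00 && code <= 0xd7af) || // Hangul Syllables
    (code >= 0x20000 && code <= 0x2a6df) // supplementary range from the list above
  )
}

function isWhitespace(code: number): boolean {
  // Assumed set: space, tab, line feed, carriage return.
  return code === 0x20 || code === 0x09 || code === 0x0a || code === 0x0d
}

function tokenizeBuffered(text: string): string[] {
  const lower = text.toLowerCase()
  const tokens: string[] = []
  let start = -1 // start index of the pending non-CJK word, or -1 if none
  let i = 0
  while (i < lower.length) {
    const code = lower.codePointAt(i)!
    // Code points above U+FFFF occupy two UTF-16 code units.
    const width = code > 0xffff ? 2 : 1
    if (isCJK(code)) {
      if (start !== -1) {
        tokens.push(lower.slice(start, i))
        start = -1
      }
      tokens.push(lower.slice(i, i + width))
    } else if (isWhitespace(code)) {
      if (start !== -1) {
        tokens.push(lower.slice(start, i))
        start = -1
      }
    } else if (start === -1) {
      start = i
    }
    i += width
  }
  // Flush a trailing word, if any.
  if (start !== -1) tokens.push(lower.slice(start))
  return tokens
}
```

Tracking only a start index means each word is produced by a single `slice` call, with no per-character string concatenation, no intermediate array from `split`, and no `filter` pass to drop empty strings.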

* test: add comprehensive unit tests for CJK search encoder

Add 21 unit tests covering:
- English word tokenization
- CJK character-level tokenization (Japanese, Korean, Chinese)
- Mixed CJK/English content
- Edge cases

All tests pass, confirming the encoder correctly handles CJK text.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-12-02 10:04:38 -08:00

| Name | Last commit | Date |
| --- | --- | --- |
| .. | | |
| | fix: add proper popover hint to tag content page | 2025-04-10 16:28:36 -07:00 |
| | feat: improve search tokenization for CJK languages (#2231) | 2025-12-02 10:04:38 -08:00 |
| | fix(css): not break word in the search button placeholder (#2182) | 2025-10-31 17:01:51 -07:00 |
| ArticleTitle.tsx | chore(types): add additional hint for LSP support (#864) | 2024-02-13 23:53:44 -05:00 |
| Backlinks.tsx | feat: support non-singleton explorer | 2025-03-10 15:13:22 -07:00 |
| Body.tsx | chore(types): add additional hint for LSP support (#864) | 2024-02-13 23:53:44 -05:00 |
| Breadcrumbs.tsx | fix: cleanup a href link construction, global shared trie, breadcrumbs use trie | 2025-03-23 17:24:43 -07:00 |
| Comments.tsx | feat(giscus): expose language option for Comments component (#2012) | 2025-06-08 11:23:01 +02:00 |
| ConditionalRender.tsx | feat: conditional render component | 2025-03-23 17:34:14 -07:00 |
| ContentMeta.tsx | fix: use time HTML element for date strings (#1622) | 2024-12-03 01:41:55 -05:00 |
| Darkmode.tsx | feat: support non-singleton darkmode | 2025-03-10 11:44:47 -07:00 |
| Date.tsx | fix: use time HTML element for date strings (#1622) | 2024-12-03 01:41:55 -05:00 |
| DesktopOnly.tsx | feat: flex component, document higher-order layout components | 2025-03-11 14:56:43 -07:00 |
| Explorer.tsx | fix(a11y): aria-controls and role fixes | 2025-08-03 22:44:35 -07:00 |
| Flex.tsx | fix(flex): respect DesktopOnly and MobileOnly components (#1971) | 2025-06-02 18:36:57 +02:00 |
| Footer.tsx | feat(layout): add afterBody | 2024-07-09 19:09:31 -07:00 |
| Graph.tsx | fix(graph): make graph non-singleton, proper cleanup, fix radial | 2025-03-10 11:39:08 -07:00 |
| Head.tsx | feat(fonts): allow PageTitle to have its own font subset (#1848) | 2025-03-18 21:43:32 -07:00 |
| Header.tsx | chore(types): add additional hint for LSP support (#864) | 2024-02-13 23:53:44 -05:00 |
| index.ts | feat: reader mode | 2025-04-17 19:45:17 -07:00 |
| MobileOnly.tsx | feat: flex component, document higher-order layout components | 2025-03-11 14:56:43 -07:00 |
| OverflowList.tsx | fix(a11y): aria-controls and role fixes | 2025-08-03 22:44:35 -07:00 |
| PageList.tsx | fix(RecentNotes): Prevent folder pages from always appearing first (closes #1901) (#1904) | 2025-04-04 10:36:29 -07:00 |
| PageTitle.tsx | feat(fonts): allow PageTitle to have its own font subset (#1848) | 2025-03-18 21:43:32 -07:00 |
| ReaderMode.tsx | feat(i18n): readermode translations and icon (#1961) | 2025-05-07 21:56:18 +02:00 |
| RecentNotes.tsx | feat: ability to hide tags in the recent notes component (#1147) | 2024-05-21 09:50:58 -07:00 |
| renderPage.tsx | Prevent double-loading of afterDOMReady scripts (#2213) | 2025-11-27 14:51:56 -08:00 |
| Search.tsx | fix(style): layout flow, search restyle | 2025-09-17 15:26:49 -07:00 |
| Spacer.tsx | fix(div): update class name to remove weird space afterwards (#763) | 2024-01-29 21:51:13 -08:00 |
| TableOfContents.tsx | fix(a11y): aria-controls and role fixes | 2025-08-03 22:44:35 -07:00 |
| TagList.tsx | fix: cleanup a href link construction, global shared trie, breadcrumbs use trie | 2025-03-23 17:24:43 -07:00 |
| types.ts | feat: support non-singleton explorer | 2025-03-10 15:13:22 -07:00 |