3439 Commits

Author SHA1 Message Date
Chunlei Niu
ba80d53cf2 Qualcomm options parsing should not be inside GPU options check.
LiteRT-PiperOrigin-RevId: 852871115
2026-01-06 11:43:44 -08:00
Google AI Edge
ffffda2740 Update compiled model to always load the external buffer into host memory
Loading external weights directly into GPU memory requires prepacking.

LiteRT-PiperOrigin-RevId: 852812464
2026-01-06 09:29:10 -08:00
Steven Toribio
825d7d5a22 Add tfl.Pack in model utils.
LiteRT-PiperOrigin-RevId: 852792641
2026-01-06 08:36:08 -08:00
Google AI Edge
11220b2e0a Re-enable options testing with LITERT_WITH_EXTERNAL_WEIGHT_LOADER
LiteRT-PiperOrigin-RevId: 852625788
2026-01-05 23:30:16 -08:00
Google AI Edge
4139e62ec1 Automated Code Change
LiteRT-Converter-PiperOrigin-RevId: 852558553
2026-01-05 19:54:06 -08:00
Byungchul Kim
5590ba3ff0 Add GPU option to disable Vulkan shader compilation optimization.
LiteRT-PiperOrigin-RevId: 852504883
2026-01-05 16:41:11 -08:00
Jae H. Yoo
1145879bd1 Add QI2 to tfl.embedding_lookup
PiperOrigin-RevId: 852473501
2026-01-05 15:16:09 -08:00
Jae H. Yoo
59d7d3ad1e Add QI2 to tfl.embedding_lookup
LiteRT-Converter-PiperOrigin-RevId: 852473501
2026-01-05 15:11:10 -08:00
Majid Dadashi
f30d0a94f0 Add support for kTfLiteInt2 in TFLite FullyConnected hybrid kernels.
This change enables hybrid quantization for FullyConnected operations where the filter weights are in kTfLiteInt2 format. The kernel now unpacks the packed 2-bit integers into int8 for computation.

PiperOrigin-RevId: 852438439
2026-01-05 13:44:21 -08:00
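The unpacking step described in commit f30d0a94f0 (packed 2-bit integers expanded to int8 before computation) can be illustrated with a minimal sketch. The function name `unpack_int2` and the bit layout are assumptions for illustration only: it assumes four signed two's-complement values per byte, stored least significant bits first. The actual kernel's packing order is not stated in the commit message and may differ.

```python
def unpack_int2(packed: bytes, count: int) -> list[int]:
    """Expand packed 2-bit signed integers into a list of int8-range values.

    Assumed layout (hypothetical, not taken from the kernel source):
    four values per byte, lowest two bits first, two's complement.
    """
    values: list[int] = []
    for byte in packed:
        for shift in (0, 2, 4, 6):
            if len(values) == count:
                return values
            raw = (byte >> shift) & 0b11
            # Two's complement for 2 bits: 0b10 -> -2, 0b11 -> -1.
            values.append(raw - 4 if raw >= 2 else raw)
    return values


# One byte 0b11_10_01_00 holds, from the low bits up: 0, 1, -2, -1.
print(unpack_int2(bytes([0b11100100]), 4))
```

Each 2-bit field can only represent the values -2, -1, 0 and 1, so every unpacked value fits in int8 without any range check.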
Google AI Edge
4db33a8daf Migrates builder.create<Op>() => Op::create()
LiteRT-Converter-PiperOrigin-RevId: 852436299
2026-01-05 13:38:29 -08:00
Terry Heo
43406c82c8 Update LiteRT C++ Options API doc in Doxygen style
LiteRT-PiperOrigin-RevId: 852426900
2026-01-05 13:13:54 -08:00
Chunlei Niu
e91a213cf5 Add error handling for TensorBufferScopedLock creation in JNI.
LiteRT-PiperOrigin-RevId: 852419640
2026-01-05 12:55:35 -08:00
Ping Yu
b28179d6cb Integrate ProfileSummarizer into LiteRtProfiler.
LiteRT-PiperOrigin-RevId: 852383063
2026-01-05 11:23:30 -08:00
Google AI Edge
66dee2d929 Update build visibility for internal testing.
LiteRT-PiperOrigin-RevId: 852360292
2026-01-05 10:34:43 -08:00
Andrew Zhang
197d6ca925 Add python api to get tensor details from a signature.
LiteRT-PiperOrigin-RevId: 852353926
2026-01-05 10:20:01 -08:00
Tommy Chiang
c5d473d92e Internal change
LiteRT-PiperOrigin-RevId: 852337393
2026-01-05 09:41:47 -08:00
Google AI Edge
e69612b8af Initialize xnn before checking fingerprint.
PiperOrigin-RevId: 852287251
2026-01-05 07:10:38 -08:00
Terry Heo
f43eff4998 Update LiteRT C++ API comments
Moved a file comment to a class description.
Added missing comments.

LiteRT-PiperOrigin-RevId: 852031899
2026-01-04 14:13:45 -08:00
Google AI Edge
5537b9230d thrInvocationContextGetOutputBufferSyncFence now supports generating an output fence for any edge in the graph that is annotated with "request_fence".
LiteRT-PiperOrigin-RevId: 851429764
2026-01-02 13:24:12 -08:00
Alexander Shaposhnikov
95009b84b8 Bump XNNPACK version for open source builds.
PiperOrigin-RevId: 850573818
2025-12-30 19:10:38 -08:00
Byungchul Kim
5ef8b3c102 Internal changes only
LiteRT-PiperOrigin-RevId: 850239567
2025-12-29 20:12:24 -08:00
Google AI Edge
e2fb1c0fe0 Fix cmake host build error due to gcc warnings being treated as errors
LiteRT-PiperOrigin-RevId: 850147345
2025-12-29 13:35:33 -08:00
Copybara-Service
44713aa110 Merge pull request #5019 from fujunwei:level_zero_buffer_as_custom_buffer
LiteRT-PiperOrigin-RevId: 850137829
2025-12-29 12:59:14 -08:00
Alexander Shaposhnikov
a7d9560258 Remove spurious explicit and invalid constexpr.
LiteRT-PiperOrigin-RevId: 849322469
2025-12-26 18:59:05 -08:00
Google AI Edge
aaacec2896 This is an internal change
LiteRT-PiperOrigin-RevId: 849279059
2025-12-26 15:16:13 -08:00
Google AI Edge
27e34cd605 This is an internal change
LiteRT-PiperOrigin-RevId: 849278986
2025-12-26 15:13:07 -08:00
Google AI Edge
d3952c78b4 Properly handle has_builtin macro on certain gcc compilers
LiteRT-PiperOrigin-RevId: 849268648
2025-12-26 14:28:14 -08:00
Google AI Edge
096ef5912f Use GitHub mirror for BCR to work around bcr.bazel.build outages
LiteRT-PiperOrigin-RevId: 849267935
2025-12-26 14:24:49 -08:00
fujunwei
bc7fcbc9ea Use OpenVINO level zero remote tensor as LiteRT custom buffer
The performance of Gemma3 improved from 0.3 tokens/s to 71 tokens/s on PTL.
2025-12-24 07:49:22 +08:00
Google AI Edge
32ec63ee89 This is an internal change
Reverts 7771eaecff29780eb0069082c2f6ac1d27a3af1e

PiperOrigin-RevId: 848262323
2025-12-23 12:27:21 -08:00
Copybara-Service
20c00d28b0 Merge pull request #5025 from graham0824:dev/chunhsue/add_missing_cmake_dep
LiteRT-PiperOrigin-RevId: 848227491
2025-12-23 10:42:09 -08:00
Google AI Edge
6b829c3919 Automated Code Change
PiperOrigin-RevId: 848073971
2025-12-23 01:32:12 -08:00
Chun-nien Chan
9faa4eae4a Add tfl quant propagation passes to model-utils
LiteRT-PiperOrigin-RevId: 848049860
2025-12-23 00:09:47 -08:00
Chun-nien Chan
9aad7a4de5 model-utils add external_const op
LiteRT-PiperOrigin-RevId: 847981237
2025-12-22 20:11:34 -08:00
Google AI Edge
7771eaecff This is an internal change
Reverts bb01f1a9736d45996115e28825e702858df83fc0

PiperOrigin-RevId: 847933822
2025-12-22 17:21:43 -08:00
Byungchul Kim
51b3f46565 Set FC's keep_num_dims to false when the output dims differ from the input dims after quantization.
On gemma3n with decode batch > 1, it happens when the embedding is coupled with PLE by einsum.
The export steps are:
1) Initial: BMM([b,2048]x[2048,7680] -> [b,7680])
2) FuseInputReshape_BatchMatMulWithFlattenedRhsDims: BMM([b,2048]x[2048,7680] -> [b,7680])
3) ConvertBatchMatMulOp2FullyConnectedOp_Rank2ConstantRhs: FC([b,2048]x[2048,7680] -> [b,7680])
4) StrictQuantizationPattern(by IsDrqTensor): FC([b,1,2048]x[2048,7680] -> [b,7680])

When FC's keep_num_dims is false and it's followed by a reshape op (like gemma3n), keep_num_dims will be set to true later with correct shapes by EnableFullyConnectedKeepNumDimsBeforeReshape.

LiteRT-Converter-PiperOrigin-RevId: 847813526
2025-12-22 10:39:30 -08:00
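The shape mismatch in step 4 of commit 51b3f46565 can be checked with a small sketch of how keep_num_dims affects a FullyConnected output shape. The helper `fully_connected_output_shape` is hypothetical and written for illustration only. It follows the commit's own notation, where the weights are written as [in_features, out_features], and it assumes a concrete batch size b = 4.

```python
import math


def fully_connected_output_shape(input_shape, weights_shape, keep_num_dims):
    """Return the FullyConnected output shape for the given flag value.

    weights_shape follows the commit's notation: [in_features, out_features].
    """
    in_features, out_features = weights_shape
    if input_shape[-1] != in_features:
        raise ValueError("last input dim must match in_features")
    if keep_num_dims:
        # Rank is preserved: only the last dimension changes.
        return list(input_shape[:-1]) + [out_features]
    # Otherwise the input is flattened to 2-D: [batch, out_features].
    batch = math.prod(input_shape) // in_features
    return [batch, out_features]


b = 4  # assumed decode batch size
print(fully_connected_output_shape([b, 1, 2048], [2048, 7680], True))
print(fully_connected_output_shape([b, 1, 2048], [2048, 7680], False))
```

With a rank-3 input [b,1,2048], keeping the rank would give [b,1,7680], while the expected output in step 4 is the rank-2 [b,7680]. Only keep_num_dims = false yields that shape, which is consistent with the change the commit describes.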
Google AI Edge
9717538247 Migrates builder.create<Op>() => Op::create() in tablegen files
LiteRT-Converter-PiperOrigin-RevId: 847796796
2025-12-22 09:46:43 -08:00
Lu Wang
3cee187ce2 Update LiteRT Readme
LiteRT-PiperOrigin-RevId: 847588728
2025-12-21 21:21:15 -08:00
Quentin Khan
bb01f1a973 Use the XNNPack packing fingerprints to invalidate the weight cache.
PiperOrigin-RevId: 846914182
2025-12-19 17:00:33 -08:00
Chun-Hsueh Lee
34044db27d PR #4837: Qualcomm AI Engine Direct - Add QNN E-wise Max & Div INT16 tests.
Imported from GitHub PR https://github.com/google-ai-edge/LiteRT/pull/4837

Copybara import of the project:

--
12b5bfe82e5d1575df2b49e7dc819a88b5313b61 by chunhsue-qti <chunhsue@qti.qualcomm.com>:

Qualcomm AI Engine Direct - Add QNN E-wise Max & Div INT16 tests.

Co-Authored-By: William Lin <chengwl@qti.qualcomm.com>

Merging this change closes #4837

COPYBARA_INTEGRATE_REVIEW=https://github.com/google-ai-edge/LiteRT/pull/4837 from graham0824:dev/chunhsue/add_op_test 12b5bfe82e5d1575df2b49e7dc819a88b5313b61
LiteRT-PiperOrigin-RevId: 846884452
2025-12-19 15:19:31 -08:00
Google AI Edge
de28f7fcb7 Remove alpha reference and update project status
LiteRT-PiperOrigin-RevId: 846879237
2025-12-19 15:04:23 -08:00
Google AI Edge
1ea9670055 LiteRT DarwiNN: fix default device and memory power states
LiteRT-PiperOrigin-RevId: 846861434
2025-12-19 14:08:48 -08:00
Google AI Edge
bd5bf4866a Apply llvm-use-new-mlir-op-builder fixes
This migrates `builder.create<Op>()` => `Op::create()`

LiteRT-Converter-PiperOrigin-RevId: 846854812
2025-12-19 13:54:06 -08:00
Fengwu Yao
e0e0870385 Internal changes only.
PiperOrigin-RevId: 846835877
2025-12-19 12:56:21 -08:00
Andrew Zhang
506a5b0919 Rename rewriter to builder to reflect its nature.
LiteRT-PiperOrigin-RevId: 846830984
2025-12-19 12:40:20 -08:00
Byungchul Kim
d71bda6ca4 Internal changes only
LiteRT-PiperOrigin-RevId: 846804879
2025-12-19 11:22:58 -08:00
Google AI Edge
330b914b06 Refactor Google Tensor compiler API to pass SOC model via options proto.
The GoogleTensorCompileFlatbuffer API now receives the SOC model information embedded within the GoogleTensorOptions proto instead of as a separate argument. The compiler_plugin populates the GoogleTensorCompilerConfig within the options proto based on the provided SOC model string.

LiteRT-PiperOrigin-RevId: 846789694
2025-12-19 10:45:38 -08:00
Terry Heo
223a1a5ad2 Update LiteRT C++ API doc in Doxygen style
The change will be available via
https://ai.google.dev/edge/api/litert/c

LiteRT-PiperOrigin-RevId: 846788580
2025-12-19 10:40:52 -08:00
Google AI Edge
87404a99f1 Google Tensor Dispatch: support dma-bufs
LiteRT-PiperOrigin-RevId: 846781070
2025-12-19 10:19:50 -08:00
Quentin Khan
78dd9618ed Update tensorflow version to pull latest XNNPack
LiteRT-PiperOrigin-RevId: 846758641
2025-12-19 09:11:35 -08:00