Li Lyna Zhang

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024
Via arXiv

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Feb 21, 2024
Via arXiv

Boosting LLM Reasoning: Push the Limits of Few-shot Learning with Reinforced In-Context Pruning

Dec 26, 2023
Via arXiv

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models

Oct 11, 2023
Via arXiv

Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

Jun 26, 2023
Via arXiv

Accurate and Structured Pruning for Efficient Automatic Speech Recognition

May 31, 2023
Via arXiv

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

Mar 21, 2023
Via arXiv

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference

Mar 15, 2023
Via arXiv

LUT-NN: Towards Unified Neural Network Inference by Table Lookup

Feb 07, 2023
Via arXiv

Boosting Mobile CNN Inference through Semantic Memory

Dec 05, 2021
Via arXiv