- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 55
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 20
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 21
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 7
Kiran Kamble
kiranr
AI & ML interests
nlp,llm
Recent Activity
new activity 17 days ago
Writer/palmyra-large: Adding `safetensors` variant of this model
authored a paper 30 days ago
Expect the Unexpected: FailSafe Long Context QA for Finance
Organizations
Collections
1
models
1
datasets
None public yet