This was something I posted in early March on Mamba, only because I'd picked up on Rudy at BRN playing with it.
Saw a couple of our employees "liking" Mamba and wondered what it was.
A couple of snips below; curious whether it was a general "like" of a great development, or whether it's something they are now working on too.
It's possible the PeaBrane/mamba-tiny repo on GitHub that was liked is something Rudy created from the full Mamba by the looks of it, but maybe I'm reading it wrong?
Interesting nonetheless, I think, given the "AI in 2024" video links Mamba, Transformers & neuromorphic all together.
AI in 2024 – On an Exponential Rise: Data, Mamba, and More | YouTube inside
Discover the transformative potential of AI in 2024. Dive into key drivers like data quality and the groundbreaking Mamba architecture.
meta-quantum.today
Mamba: This refers to the emergence of new, groundbreaking AI architectures like transformers and neuromorphic computing. These architectures mimic the human brain’s structure and function, allowing for significantly faster processing and deeper learning capabilities. Mamba-based models will revolutionize areas like natural language processing, image recognition, and robotics.
- Mamba Architecture: Revolutionizing Sequence Modelling
- Mamba, a groundbreaking architecture, represents a leap forward from Transformer models.
- It addresses the computational challenges of large-scale sequence processing.
- Albert Gu’s work on structured state spaces inspired Mamba’s development.
- The architecture’s potential lies in its ability to handle extensive sequences, as demonstrated in DNA classification tasks.
[Submitted on 1 Dec 2023]
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu, Tri Dao
Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time architectures such as linear attention, gated convolution and recurrent models, and structured state space models (SSMs) have been developed to address Transformers' computational inefficiency on long sequences, but they have not performed as well as attention on important modalities such as language. We identify that a key weakness of such models is their inability to perform content-based reasoning, and make several improvements. First, simply letting the SSM parameters be functions of the input addresses their weakness with discrete modalities, allowing the model to selectively propagate or forget information along the sequence length dimension depending on the current token. Second, even though this change prevents the use of efficient convolutions, we design a hardware-aware parallel algorithm in recurrent mode. We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even MLP blocks (Mamba). Mamba enjoys fast inference (5× higher throughput than Transformers) and linear scaling in sequence length, and its performance improves on real data up to million-length sequences. As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics. On language modeling, our Mamba-3B model outperforms Transformers of the same size and matches Transformers twice its size, both in pretraining and downstream evaluation.
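For anyone who, like me, wondered what "letting the SSM parameters be functions of the input" actually means in practice, here's a very rough sketch of the idea in plain PyTorch. To be clear, this is just my own simplified illustration of the selection mechanism the abstract describes, not BrainChip's code or the paper's implementation, and the names (selective_scan, w_delta, w_B, w_C) are made up for the example:

```python
import torch
import torch.nn.functional as F

# Rough sketch of the "selective" idea: the SSM parameters (step size, B, C)
# are computed from the current token, then the hidden state is updated with
# a plain sequential recurrence. Shapes and names are my own simplification.
def selective_scan(x, A, w_delta, w_B, w_C):
    batch, length, dim = x.shape           # x: (batch, length, dim) input sequence
    state_dim = A.shape[-1]                # A: (dim, state_dim) transition params (negative for decay)
    h = torch.zeros(batch, dim, state_dim)
    ys = []
    for t in range(length):
        xt = x[:, t]                                   # (batch, dim), current token
        delta = F.softplus(xt @ w_delta)               # input-dependent step size
        B = xt @ w_B                                   # input-dependent input matrix
        C = xt @ w_C                                   # input-dependent output matrix
        A_bar = torch.exp(delta.unsqueeze(-1) * A)     # discretised transition
        h = A_bar * h + (delta * xt).unsqueeze(-1) * B.unsqueeze(1)
        ys.append((h * C.unsqueeze(1)).sum(-1))        # read the state back out
    return torch.stack(ys, dim=1)                      # (batch, length, dim)
```

The Python loop is only there for readability; the actual paper replaces it with the hardware-aware parallel scan mentioned in the abstract.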
Keith Johnson - BrainChip | LinkedIn
I have a passion for machine learning and artificial intelligence. I'm interested in… · Experience: BrainChip · Education: The University of Western Australia · Location: Greater Perth Area · 195 connections on LinkedIn.
au.linkedin.com
Rudy Pei - BrainChip | LinkedIn
I have a passion for ML research and engineering. In particular, I love efficient models… · Experience: BrainChip · Education: University of California San Diego · Location: San Diego · 500+ connections on LinkedIn.
www.linkedin.com
Rudy Pei
Physicist | ML researcher | quantum & neuromorphic computing | behavioral economics | composer
3w Edited
Mamba is a new state-space model out-performing transformers everywhere it has been tried. Originally, it was trained with associative scan, which pytorch does not support natively, hence the need for custom CUDA kernels. However, there is a simple math trick to express the associative scans used in mamba as a ratio of two cumulative sums. This makes an efficient native pytorch implementation of mamba possible. How? Check out my simple repo with a one-file implementation of this idea, forked from the mamba-minimal repo https://lnkd.in/g5QR7yHC #mamba #llm
GitHub - PeaBrane/mamba-tiny: Simple, minimal implementation of the Mamba SSM in one file of PyTorch. More efficient than the minimalist version but less efficient than the original mamba implementation.
github.com
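If I'm reading Rudy's post right, the "ratio of two cumulative sums" trick is about the linear recurrence at the heart of the scan: h_t = a_t·h_{t-1} + b_t can be unrolled so the whole sequence comes out of two cumsum calls, with the running products handled in log space. Here's a small sketch of what I understand the idea to be; it's my own toy version under that assumption, not code from the mamba-tiny repo:

```python
import torch

def scan_naive(a, b):
    # Reference: the recurrence h_t = a_t * h_{t-1} + b_t, computed step by step.
    h = torch.zeros_like(b[..., 0])
    out = []
    for t in range(b.shape[-1]):
        h = a[..., t] * h + b[..., t]
        out.append(h)
    return torch.stack(out, dim=-1)

def scan_cumsum(a, b):
    # Same recurrence as a "ratio" of cumulative sums:
    #   h_t = (prod_{i<=t} a_i) * sum_{j<=t} b_j / (prod_{i<=j} a_i)
    # The running product becomes a cumsum of logs, so only native pytorch ops are needed.
    log_a_star = torch.cumsum(torch.log(a), dim=-1)     # log of the running product
    b_scaled = b * torch.exp(-log_a_star)               # b_j / prod_{i<=j} a_i
    return torch.exp(log_a_star) * torch.cumsum(b_scaled, dim=-1)

a = torch.rand(2, 8) * 0.5 + 0.5   # decay factors in (0.5, 1) to keep the log-space trick well behaved
b = torch.randn(2, 8)
print(torch.allclose(scan_naive(a, b), scan_cumsum(a, b), atol=1e-5))   # True
```

The obvious trade-off, as far as I can tell, is numerical range: dividing by a very long running product can blow up, which would be one reason the original custom CUDA scan is still the more robust option for long sequences.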