Hello, I'm Amirali.
I'm a software developer and machine learning researcher with a passion for Deep Learning and Natural Language Processing.
Check out my projects
About Me
I am passionate about solving real-world problems using deep learning and software engineering. My work spans building AI-powered tools, benchmarking models, and exploring innovative solutions in Natural Language Processing, speech recognition, and network data analysis. I am currently working at DoneX.
I have worked on network data analysis at the Information, Network & Learning Lab under the supervision of Dr. Mahdi Jafari Siavoshani. I have also worked on Natural Language Processing at Asr Gooyesh under the supervision of Dr. Hossein Sameti.
Selected Projects
Multi-Document Q&A via Semantic Retrieval
We built a system that answers user questions by searching across multiple documents. We embed all document chunks and compare them to the question's embedding to find the most relevant pieces, then prompt a language model with both the question and the retrieved chunks. The answer is generated strictly from the retrieved context and returned to the user: efficient, scalable, and context-aware.
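A minimal sketch of the retrieval-and-prompt step, assuming a sentence-transformers embedding model and cosine-similarity ranking (illustrative choices, not necessarily the exact stack used in the project):

```python
# Embed chunks, rank them by cosine similarity to the question,
# and build an LLM prompt from the top hits.
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative embedding model choice.
model = SentenceTransformer("all-MiniLM-L6-v2")

def retrieve(question: str, chunks: list[str], top_k: int = 3) -> list[str]:
    # Encode chunks and question into the same embedding space.
    chunk_vecs = model.encode(chunks, normalize_embeddings=True)
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    # With normalized vectors, cosine similarity is just a dot product.
    scores = chunk_vecs @ q_vec
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]

def build_prompt(question: str, context: list[str]) -> str:
    # The language model is asked to answer using only the retrieved context.
    joined = "\n\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"
    )
```

The retrieved chunks keep the prompt small regardless of how many documents are indexed, which is what makes the approach scale.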
ASR Benchmark with New Data
We propose a method for benchmarking different Automatic Speech Recognition models on our newly collected test set. We find that our wav2vec2 model (augmented with an n-gram language model) outperforms other state-of-the-art models on this dataset.
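As a rough illustration, the benchmarking loop boils down to transcribing the test set with each model and comparing word error rates; the `transcribe` callables and the jiwer-based scoring below are placeholders rather than the project's actual pipeline:

```python
# Run each ASR model over the test set and compare word error rates (WER).
from jiwer import wer

def benchmark(models: dict, testset: list[dict]) -> dict[str, float]:
    """models maps a model name to a `transcribe(audio_path) -> str` callable;
    testset is a list of {"audio": path, "text": reference_transcript} samples."""
    results = {}
    for name, transcribe in models.items():
        refs = [sample["text"] for sample in testset]
        hyps = [transcribe(sample["audio"]) for sample in testset]
        results[name] = wer(refs, hyps)  # lower is better
    return results
```

A wav2vec2 model with n-gram-LM decoding would plug in as just another `transcribe` callable alongside the baseline systems.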
Causal Inference on Communication Data
Using the Granger causality test, we analyze how each telecommunication node (representing a province) influences and is influenced by other nodes over time. The data consists of daily call, messaging, and mobile data usage figures, reported per province over a five-year period. We infer both correlation and causal influence levels.
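A small sketch of one pairwise test using the Granger causality implementation in statsmodels; the column names, lag window, and example provinces are illustrative assumptions:

```python
# Test whether one province's daily usage series helps predict another's.
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

def granger_p_value(df: pd.DataFrame, caused: str, causing: str, maxlag: int = 7) -> float:
    """Smallest p-value over lags 1..maxlag for `causing` Granger-causing `caused`."""
    # statsmodels expects the caused series in the first column.
    data = df[[caused, causing]].dropna()
    results = grangercausalitytests(data, maxlag=maxlag)
    return min(res[0]["ssr_ftest"][1] for res in results.values())

# Hypothetical usage: does Tehran's call volume help predict Isfahan's?
# p = granger_p_value(usage_df, caused="isfahan_calls", causing="tehran_calls")
```

Running this over all ordered province pairs yields a directed influence matrix, which complements the symmetric picture given by plain correlation.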
TLS Info Leakage with ML
Using machine learning methods, we propose a website fingerprinting attack that extracts information about users' traffic, specifically which website the user is visiting. We then locate the sources of leakage with model interpretation techniques, mask those parts, and run the extraction again. With each iteration, the information leakage decreases, and we learn which parts of TLS leak the most.
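Conceptually, each attack-and-mask round looks like the sketch below; the classifier, feature representation, and zero-masking strategy are illustrative stand-ins rather than the project's exact setup:

```python
# One iteration: train a classifier to predict the visited website from TLS
# traffic features, rank features by importance to find the leakage sources,
# mask the top offenders, and attack again.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def attack_accuracy(X: np.ndarray, y: np.ndarray) -> tuple[float, np.ndarray]:
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    acc = cross_val_score(clf, X, y, cv=5).mean()  # how well the attack works
    clf.fit(X, y)
    return acc, clf.feature_importances_           # which features leak the most

def mask_top_features(X: np.ndarray, importances: np.ndarray, k: int = 10) -> np.ndarray:
    # Zero out ("mask") the k most informative features and return a copy.
    leaky = np.argsort(importances)[::-1][:k]
    X_masked = X.copy()
    X_masked[:, leaky] = 0
    return X_masked

# Per round: acc, imp = attack_accuracy(X, y); X = mask_top_features(X, imp)
# Attack accuracy should drop as the most revealing parts of the traffic are masked.
```

The drop in attack accuracy across rounds is what quantifies how much each masked part of the TLS traffic was leaking.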