MMPF: Multi-Modal Perception Framework for Abnormal Medical Condition Detection

Authors: Chuyi Zhong, Dingkang Yang, Peng Zhai, Lihua Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of this approach, showing improved performance over existing methods.
Researcher Affiliation | Academia | 1 Academy for Engineering and Technology, Fudan University; 2 Cognition and Intelligent Technology Laboratory (CIT Lab); 3 Institute of Metaverse & Intelligent Medicine, Fudan University; 4 Jilin Provincial Key Laboratory of Intelligence Science and Engineering, Changchun, China; 5 Engineering Research Center of AI and Robotics, Ministry of Education, Shanghai, China
Pseudocode | No | The paper describes methodologies and architecture in text and diagrams (Figure 1), but does not contain a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper mentions releasing a novel dataset but does not provide any statement about releasing source code or a link to a code repository.
Open Datasets | Yes | A novel image dataset has been released, encompassing a wide range of medical conditions and providing a robust benchmark for future medical condition detection tasks.
Dataset Splits | Yes | The dataset totals 10,500 images, with a split ratio of 70%:15%:15%.
Hardware Specification | Yes | Training is conducted on two NVIDIA Quadro RTX 8000 GPUs.
Software Dependencies | No | The framework is implemented using TensorFlow (Pang, Nijkamp, and Wu 2020) with Keras (Géron 2022). The paper mentions TensorFlow and Keras but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | The AdamW optimizer (Loshchilov and Hutter 2017) is employed, with an initial learning rate of 1e-3 and weight decay of 1e-4. Models are trained with a batch size of 32 for 50 epochs. The input dimensions are set to (3, 224, 224) for images and (1, 3, 17) for 3D posture coordinates.
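For reference, the per-split image counts implied by the reported 70%:15%:15% ratio over 10,500 images can be derived as follows. This is a minimal sketch, not code from the paper; it assumes the ratio applies exactly, and uses integer percentages to avoid floating-point truncation:

```python
# Derive per-split counts from the reported total and split ratio.
# Assumption (not stated in the paper): the ratio divides the dataset exactly.
TOTAL_IMAGES = 10_500
SPLIT_PERCENTS = {"train": 70, "val": 15, "test": 15}

splits = {name: TOTAL_IMAGES * pct // 100 for name, pct in SPLIT_PERCENTS.items()}
assert sum(splits.values()) == TOTAL_IMAGES
print(splits)  # {'train': 7350, 'val': 1575, 'test': 1575}
```

The counts confirm the split is exact: 10,500 divides cleanly at these percentages, so no rounding policy needs to be specified.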
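The reported experiment setup can be collected into a single configuration object. This is an illustrative sketch only: the field names and the `TrainConfig` class are not from the paper, and the reading of (1, 3, 17) as 17 keypoints with 3 coordinates each is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    # Hyperparameters as reported in the paper's experiment setup.
    optimizer: str = "AdamW"            # Loshchilov and Hutter 2017
    learning_rate: float = 1e-3         # initial learning rate
    weight_decay: float = 1e-4
    batch_size: int = 32
    epochs: int = 50
    image_shape: tuple = (3, 224, 224)  # (channels, height, width)
    pose_shape: tuple = (1, 3, 17)      # 3D posture coordinates; likely 17 keypoints (assumption)

cfg = TrainConfig()
print(cfg.optimizer, cfg.learning_rate, cfg.batch_size)  # AdamW 0.001 32
```

Freezing the dataclass makes the reported settings immutable, which helps keep a reproduction run consistent with the published values.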