Eight Music Emotions Recognition System using Neural Network with Cascaded model

Kanawat Sorussa

Please use this identifier to cite or link to this item: http://kb.psu.ac.th/psukb/handle/2016/12953

Title:	Eight Music Emotions Recognition System using Neural Network with Cascaded model
Other Titles:	Eight Music Emotions Recognition System Using an Artificial Neural Network with a Cascaded Structure
Authors:	Montri Karnjanadecha; Kanawat Sorussa; Faculty of Engineering (Department of Computer Engineering)
Keywords:	Neural networks (Computer science)
Issue Date:	2019
Publisher:	Prince of Songkla University
Abstract:	Music selection is difficult without an efficient organization based on metadata or tags, and one effective tagging scheme is based on the emotion expressed by the music. The main drawback of such a scheme is that music files must be tagged manually: tagging a large number of files is tedious, and emotional perception differs from person to person. Therefore, this thesis presents a music emotion classification system for eight emotional classes with a cascaded model. Russell's emotion model was adopted as a common ground for emotional annotation. The system was implemented in MATLAB, using MIR toolbox to extract acoustic features from audio files and a supervised machine learning technique to build predictive models from those features. Four predictive models were proposed and compared. The models were formed by pairing two types of neural networks, i.e., Levenberg-Marquardt (LM) and resilient backpropagation (Rprop), with two types of structures: a traditional multiclass unit and multiple binary-class units arranged in a cascaded structure. The performance of each model was evaluated on the DEAM benchmark. The best result was achieved by the model trained with a cascaded Rprop neural network (accuracy of 89.5%). In addition, correlation coefficient analysis showed that timbre features had the greatest impact on prediction. This work offers an opportunity for a competitive advantage because only a few music providers currently tag music with emotional terms.
Abstract(Thai):	Selecting music that matches one's needs is not easy if the music has not been organized in advance using metadata. One effective form of organization is categorization by musical emotion. However, performing such categorization by hand for a very large number of music files cannot be done efficiently, because of fatigue and because subjective perception differs from person to person. This research therefore proposes a classification system and a cascaded model for classifying music into eight groups of musical emotion, using the definitions of the emotions in Russell's model as the criteria for assigning music to categories. The prototype system was developed in MATLAB, using MIR toolbox as the tool for extracting acoustic features. Four models were compared, pairing two types of neural networks (Levenberg-Marquardt and resilient backpropagation) with two structuring methods (a traditional neural network and multiple units in a cascaded structure). Model performance was evaluated with the DEAM benchmark. The results show that building the model from multiple resilient backpropagation neural network units in a cascade gives an accuracy of 89.5 percent for classification into eight emotion groups. In addition, correlation analysis of the acoustic features shows that timbre features have the greatest effect on prediction. This research can create a competitive advantage for music service providers, because at present only a few of them categorize music by musical emotion.
Description:	Thesis (M.Eng., Computer Engineering)--Prince of Songkla University, 2019
URI:	http://kb.psu.ac.th/psukb/handle/2016/12953
Appears in Collections:	241 Thesis
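
Illustrative sketch (not taken from the thesis): the abstract describes "multiple units of binary-class with a cascaded structure" but does not specify how the units are arranged or trained. The Python sketch below assumes one simple arrangement, a chain in which each binary unit decides "this class or pass on", with the last class as the fall-through. The thesis itself used MATLAB neural networks trained with LM or Rprop; here a tiny logistic unit stands in for each binary network, and the names `train_binary_unit`, `train_cascade`, `predict_cascade`, the three toy classes, and the 2-D toy features are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
# A binary unit returns the probability that a sample belongs to its target class.
BinaryUnit = Callable[[Features], float]


def _sigmoid(z: float) -> float:
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_binary_unit(X: List[Features], y: List[int],
                      epochs: int = 2000, lr: float = 0.1) -> BinaryUnit:
    """Fit a logistic unit by SGD (a stand-in for one binary neural network)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g

    def unit(x: Features) -> float:
        return _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

    return unit


def train_cascade(X: List[Features], labels: List[str],
                  class_order: List[str]) -> Tuple[List[Tuple[str, BinaryUnit]], str]:
    """Train one binary unit per class (except the last), each on the samples
    that earlier stages are supposed to have passed on."""
    stages = []
    remaining = list(zip(X, labels))
    for target in class_order[:-1]:
        xs = [x for x, _ in remaining]
        ys = [1 if lab == target else 0 for _, lab in remaining]
        stages.append((target, train_binary_unit(xs, ys)))
        remaining = [(x, lab) for x, lab in remaining if lab != target]
    return stages, class_order[-1]


def predict_cascade(stages: List[Tuple[str, BinaryUnit]], fallback: str,
                    x: Features) -> str:
    """Walk the chain; the first unit that accepts the sample decides its class."""
    for target, unit in stages:
        if unit(x) >= 0.5:
            return target
    return fallback


# Toy data: three well-separated clusters standing in for emotion classes.
X = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.5),
     (4.0, 0.0), (4.2, 0.5), (3.8, 0.3),
     (0.0, 4.0), (0.4, 4.1), (0.3, 3.7)]
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3

stages, fallback = train_cascade(X, labels, ["A", "B", "C"])
print(predict_cascade(stages, fallback, (0.1, 0.1)))
```

With eight classes the same chain would need seven binary units. Whether the thesis used a chain like this or a tree of binary decisions is not stated in this record.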

Files in This Item:

File	Description	Size	Format
436815.pdf		3.03 MB	Adobe PDF
