Linear predictor function

In statistics and machine learning, a linear predictor function is a function of a linear combination of the independent variables (the sum of the products of a set of coefficients with the independent variables) that is used to predict a dependent variable.^[1] The simplest case arises in linear regression, where the coefficients are called regression coefficients. Linear predictor functions also appear in various classification models, such as logistic regression,^[2] the perceptron,^[3] support vector machines,^[4] and linear discriminant analysis,^[5] as well as in other models such as principal component analysis^[6] and factor analysis. In many of these models the coefficients are called "weights".

Mathematical definition

Let $x = (x_{1}, \ldots, x_{p})$ denote the independent variables and $y$ the dependent variable. A linear predictor function predicts $y$ from $x$ in the form shown below. Put more simply, predicting the dependent variable requires only a linear combination of the independent variables. The formula assumes that $x$ has $p$ dimensions:

$f(x)=g\left(\beta _{0}+\beta _{1}x_{1}+\cdots +\beta _{p}x_{p}\right)$

In linear regression, the function $g(\cdot )$ is the identity function, so that:^[7]^[2]

$f(x)=\beta _{0}+\beta _{1}x_{1}+\cdots +\beta _{p}x_{p}$
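The general form $f(x)=g(\beta_{0}+\beta_{1}x_{1}+\cdots+\beta_{p}x_{p})$ and its linear-regression special case can be sketched in a few lines of Python. This is only an illustration: the function name `linear_predictor` and the coefficient values are made up for the example and are not taken from any fitted model.

```python
from typing import Callable, Sequence


def linear_predictor(
    x: Sequence[float],
    beta0: float,
    beta: Sequence[float],
    g: Callable[[float], float] = lambda z: z,
) -> float:
    """Return g(beta0 + beta[0]*x[0] + ... + beta[p-1]*x[p-1]).

    The default g is the identity, which corresponds to linear regression.
    """
    if len(x) != len(beta):
        raise ValueError("x and beta must both have length p")
    # Linear combination of the independent variables plus the intercept.
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return g(z)


# Illustrative (assumed) coefficients for p = 2 independent variables.
beta0 = 1.0
beta = [0.5, -1.0]
x = [2.0, 3.0]

y_hat = linear_predictor(x, beta0, beta)  # 1.0 + 0.5*2.0 - 1.0*3.0
print(y_hat)  # -1.0
```

Passing a different `g` turns the same linear combination into one of the other models described in this section.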

In logistic regression,^[2] the function $f(\cdot )$ is defined as follows. Here the sigmoid function $\sigma (z)=1/(1+e^{-z})$ gives the probability that the dependent variable equals 1 as a function of a linear combination of the independent variables:^[8]

$f(x)={\begin{cases}1,&{\text{if }}\ \sigma \left(\beta _{0}+\beta _{1}x_{1}+\cdots +\beta _{p}x_{p}\right)>0.5\\0,&{\text{otherwise}}\end{cases}}$
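The thresholded sigmoid rule above might be written as follows. This is a minimal sketch: the helper names and the coefficient values are assumptions for illustration, and the sigmoid is computed in a branch-wise form only to avoid floating-point overflow for large negative inputs.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function sigma(z) = 1 / (1 + exp(-z)), computed stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z < 0, so exp(z) is below 1
    return e / (1.0 + e)


def logistic_probability(x: Sequence[float], beta0: float, beta: Sequence[float]) -> float:
    """Estimated probability that the dependent variable equals 1."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return sigmoid(z)


def logistic_classify(x: Sequence[float], beta0: float, beta: Sequence[float]) -> int:
    """Return 1 if sigma(linear combination) > 0.5, otherwise 0."""
    return 1 if logistic_probability(x, beta0, beta) > 0.5 else 0


# Illustrative (assumed) coefficients, not fitted to any data.
beta0 = -1.0
beta = [2.0, 0.5]

print(logistic_classify([1.0, 0.0], beta0, beta))  # 1  (linear part is  1.0)
print(logistic_classify([0.0, 0.0], beta0, beta))  # 0  (linear part is -1.0)
```

Because $\sigma(z) > 0.5$ exactly when $z > 0$, the predicted class depends only on the sign of the linear combination.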

In support vector machines, $g(\cdot )$ is the sign function: the value of the dependent variable is determined by which side of the hyperplane defined by the linear combination of the independent variables the point $x$ lies on. The dependent variable is assumed here to take the value $+1$ or $-1$:^[4]

$f(x)={\text{sign}}\left(\beta _{0}+\beta _{1}x_{1}+\cdots +\beta _{p}x_{p}\right)$
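The sign-based decision rule can be illustrated as below. The sketch covers only the prediction step, not how a support vector machine learns its coefficients. The coefficient values are invented for the example, and mapping a point that lies exactly on the hyperplane to $-1$ is an assumed convention, since the text does not specify that case.

```python
from typing import Sequence


def sign(z: float) -> int:
    """Return +1 for positive z and -1 otherwise (z == 0 maps to -1 by assumption)."""
    return 1 if z > 0 else -1


def hyperplane_classify(x: Sequence[float], beta0: float, beta: Sequence[float]) -> int:
    """Classify x by the side of the hyperplane beta0 + beta . x = 0 it falls on."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return sign(z)


# Illustrative (assumed) hyperplane: x1 + x2 = 1.
beta0 = -1.0
beta = [1.0, 1.0]

print(hyperplane_classify([2.0, 0.5], beta0, beta))  # 1   (linear part is  1.5)
print(hyperplane_classify([0.0, 0.0], beta0, beta))  # -1  (linear part is -1.0)
```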

Generalized function

To predict the dependent variable more accurately, the linear combination is sometimes taken over a mapping of the independent variables rather than over the variables themselves, that is:

$f(x)=g\left(\beta _{0}+\beta _{1}\phi (x)_{1}+\cdots +\beta _{h}\phi (x)_{h}\right)$

In this function, $x$ is carried from the $p$-dimensional space to an $h$-dimensional space by the mapping $\phi$, and the resulting values are then combined linearly in that space.

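For example, in linear regression with a single independent variable, the dependent variable can be modeled as a polynomial of degree $h$ in that variable. This is equivalent to mapping the independent variable to an $h$-dimensional space, $\phi (x)=(x,x^{2},\ldots ,x^{h})$, and performing the regression in that space: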

$f(x)=\beta _{0}+\beta _{1}x+\beta _{2}x^{2}+\cdots +\beta _{h}x^{h}$
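The polynomial case can be sketched as a feature map followed by an ordinary linear combination. The names `polynomial_features` and `generalized_predictor` and the coefficient values are hypothetical and chosen only for illustration. Estimating the coefficients from data is outside the scope of this sketch.

```python
from typing import Callable, List, Sequence


def polynomial_features(x: float, h: int) -> List[float]:
    """Map a scalar x to phi(x) = (x, x^2, ..., x^h)."""
    return [x ** k for k in range(1, h + 1)]


def generalized_predictor(
    x: float,
    beta0: float,
    beta: Sequence[float],
    phi: Callable[[float], Sequence[float]],
    g: Callable[[float], float] = lambda z: z,
) -> float:
    """Return g(beta0 + beta[0]*phi(x)[0] + ... + beta[h-1]*phi(x)[h-1])."""
    features = phi(x)
    if len(features) != len(beta):
        raise ValueError("phi(x) and beta must both have length h")
    # The model stays linear in the coefficients even though it is
    # nonlinear in the original variable x.
    return g(beta0 + sum(b * f for b, f in zip(beta, features)))


# Illustrative (assumed) cubic: f(x) = 1 + 2x + 0*x^2 - x^3.
beta0 = 1.0
beta = [2.0, 0.0, -1.0]

y_hat = generalized_predictor(2.0, beta0, beta, lambda x: polynomial_features(x, 3))
print(y_hat)  # -3.0
```

Swapping in a different `phi` changes the feature space without changing the linear-combination step.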

References

[1] Makhoul, J. (1975). "Linear prediction: A tutorial review". Proceedings of the IEEE. 63 (4): 561–580. doi:10.1109/PROC.1975.9792. ISSN 0018-9219.
[2] Freedman, David A. (2009). Statistical Models: Theory and Practice. Cambridge University Press. p. 26. "A simple regression equation has on the right hand side an intercept and an explanatory variable with a slope coefficient. A multiple regression equation has two or more explanatory variables on the right hand side, each with its own slope coefficient."
[3] Rosenblatt, Frank (1957). The Perceptron--a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.
[4] Cortes, Corinna; Vapnik, Vladimir N. (1995). "Support-vector networks" (PDF). Machine Learning. 20 (3): 273–297. CiteSeerX 10.1.1.15.9362. doi:10.1007/BF00994018.
[5] McLachlan, G. J. (2004). Discriminant Analysis and Statistical Pattern Recognition. Wiley Interscience. ISBN 978-0-471-69115-0.
[6] Jolliffe, I. T. (2002). Principal Component Analysis. Springer Series in Statistics (2nd ed.). New York: Springer. XXIX, 487 p., 28 illus. ISBN 978-0-387-95442-4.
[7] Rencher, Alvin C.; Christensen, William F. (2012). "Chapter 10, Multivariate regression – Section 10.1, Introduction". Methods of Multivariate Analysis. Wiley Series in Probability and Statistics, vol. 709 (3rd ed.). John Wiley & Sons. p. 19. ISBN 978-1-118-39167-9.
[8] Walker, S. H.; Duncan, D. B. (1967). "Estimation of the probability of an event as a function of several independent variables". Biometrika. 54 (1/2): 167–178. doi:10.2307/2333860. JSTOR 2333860.
