****************************************************************************
*************** MLChina (Chinese Machine Learning Mailing List), Vol. 3, No. 5, May 2003 ***************
****************************************************************************
--------------------------------------------
[Contents]
--------------------------------------------
[Editor's Note]
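MLChina now has 413 subscribers, and not everyone has the same Internet access. Some members have good connections, but others are sometimes less fortunate, for example when traveling and connecting by modem. Large attachments are then hard to download and can easily fill a mailbox, as most of us have experienced at one time or another. On behalf of those members, I ask everyone to use attachments as little as possible. Instead, post a link on the web, send the file point-to-point, or use whatever other method you can think of.
What I can offer is my FTP server, ftp://202.120.8.48/inconing, which allows anonymous upload and download; its drawback is that it has a mainland China IP address and can only be reached from the mainland.
The other option is the upload area of the Data Mining board on Nanjing University's Lily BBS (小百合), http://datamining@bbs.nju.edu.cn; after uploading, you can post a link. If you know of other methods, please share them.
Because I was quarantined after a business trip in April, my archive of list mail is incomplete, so this digest was compiled mainly from www.cn99.com. Some discussions carry no source attribution; please accept my apologies, and contact me at lgz@sjtu.edu.cn if you would like an attribution added. I also remind everyone to include your own email address when posting in future. Thank you for your cooperation.
With SARS spreading, I wish you all good health and smooth work! Li Guozheng (李国正)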
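Conference Information
Workshop on Neural Networks and Their Applications
CFP: CIRAS'03
CFP: APBC2004
APLAS'03
Society News
2003 CDMS-Newsletter No. 8: Table of Contents
Discussion
On data preprocessing
Least-squares linear fitting program
Classification of massive data
ER vs. EAV
Data mining in deductive databases?
Support vectors in SVM
A question about decision trees
Questions about Bayesian networks
Corpora for Chinese text classification
Tools for Bayesian networks?
Question: the Coulomb model
How are *.data, *.test and *.names files generated?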
********************************************
[Editor's Note]
--------------------------------------------
From: sgshan
Sent: 2003-05-06 19:09:44.0
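Subject: Strong request not to send large attachments
Dear colleagues,
Once again, I strongly urge everyone not to send large email attachments. Many people dial up from home over a phone line, and a file of just over 1 MB can take more than half an hour to download, which wastes both time and money. Please send such files point-to-point to those who need them.
Could the list administrator enforce this strictly? Thank you very much!
Wishing everyone good health!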
sgshan
********************************************
[Conference Information]
--------------------------------------------
From: zhouzh@nju.edu.cn
Sent: 2003-04-12 14:20:22.0
Subject: Workshop on Neural Networks and Their Applications
--------------------------------------------
From: zhouzh@nju.edu.cn
Sent: 2003-04-14 16:56:33.0
Subject: CFP: CIRAS'03
2nd Int. Conference on Computational Intelligence, Robotics and
Autonomous Systems, 15-18 December 2003, Singapore
[http://ciras.nus.edu.sg]
[ciras@nus.edu.sg]
Online Paper Submission: [http://act.ee.nus.edu.sg/ciras2003/]
Important Dates
Submission: 1 July 2003
Acceptance: 15 August 2003
Final Submission: 15 September 2003
Organized by: Centre for Intelligent Control, National Univ. of Singapore
Co-sponsored by:
IEEE SMC Society Singapore Chapter
IEEE R&A Society Singapore Chapter
Supported by: Lee Foundation
The Centre for Intelligent Control, National University of Singapore is
pleased to announce that the 2nd International conference on
Computational Intelligence, Robotics and Autonomous Systems (CIRAS 2003)
is planned in December 2003 in Singapore. The CIRAS'2001 was
successfully held at the National University of Singapore in November
2001. Prof. Zadeh L. A. and Prof. Xin Yao delivered the keynote
addresses at CIRAS'2001.
Intelligence in automation systems is increasingly becoming a key
and important technology to be harnessed for enhancing productivity and
economic returns. CIRAS2003 will focus on research directions that are
broadly covered by the fields, Computational Intelligence (CI), Robotics
and Autonomous Systems. CIRAS is intended to provide a common platform
for knowledge dissemination among researchers working in related areas.
CIRAS invites submissions from all areas related to, but not limited to,
Computational Intelligence, Robotics and Autonomous Systems.
Intelligent Control
Real Time Control
DNA Computing
Life Sciences
Fuzzy Systems
Neuro-Fuzzy Systems
Neural Networks (NN)
Autonomous Systems
Multi-Agent Systems (MAS)
System Design Automation
Robotics, Humanoids
Sensor Fusion
Cooperative Robotics
Robot Soccer Systems
Evolutionary Robotics
Evolvable Hardware
Distributed Systems
Embedded Systems
Non-Linear Systems
Educational Technology
Rough Sets, Data Mining
Power Systems
Genetic Algorithm (GA)
Evolutionary Computation (EC)
Hybrid CI Algorithms
Distributed Evolutionary Algorithms
Real Time Evolutionary Computation
Evolutionary Logistics
Evolutionary Systems
Multi-Objective Evolutionary Algorithm
Paper Submission
Authors are invited to submit the complete manuscript online
[http://act.ee.nus.edu.sg/ciras2003/].
Manuscripts are limited to six (6) A4 size pages. The first page should
contain the title of the paper and the full name(s) and address(es) of the
author(s). The complete mailing address(es), telephone number(s), FAX
number(s) and email address(es) should be provided on a separate sheet.
Please submit the complete manuscript in PDF format. The LaTeX format is
available on the web.
Special Sessions
CIRAS solicits special session proposals. The special sessions are
intended to usher in in-depth discussions in special areas relevant to
the conference theme. The session organizers will coordinate the
associated review process. The conference proceedings will include all
papers from the special sessions.
For further details, please contact:
CIRAS 2003 Conference Secretariat
c/o Integrated Meetings Specialist Pte Ltd
1122A Serangoon Road, Singapore 328206
Tel.: +65-62955790 Fax: +65-62955792
E-mail: [ciras@inmeet.com.sg]
--------------------------------------------
From: "Zhi-Hua Zhou" <zhouzh@nju.edu.cn>
Sent: 2003-04-16 14:30:44.0
Subject: CFP: APBC2004
------------------CALL FOR PAPERS---------------------------------
The Second Asia-Pacific Bioinformatics Conference
Within Australasian Computer Science Week
Dunedin, New Zealand, 18-22 January 2004
http://www.fit.qut.edu.au/~chenp/APBC2004
------------------------------------------------------------------
The Asia-Pacific Bioinformatics Conference series is an annual
forum for exploring research, development and novel applications
of Bioinformatics. The Second Asia-Pacific Bioinformatics
Conference, APBC'04, will be held at the University of
Otago, New Zealand, as part of the Australasian Computer
Science Week.
The conference week also includes:
-the Australasian Computer Science Conference (ACSC'04),
-the Australasian Database Conference (ADC'04),
-Computing: The Australasian Theory Symposium (CATS'04),
-the Australasian User Interface Conference (AUIC'04).
..................................................
As with previous years, registration to the Asia-Pacific Bioinformatics Conference will enable delegates to attend sessions in any conference participating in the Australasian Computer Science Week.
Key Information
The importance of bioinformatics is growing rapidly and the
volume of biological data is increasing exponentially.
These data are characterized by variety and heterogeneity:
they are related to different organic structures, environments,
and spatial scales, and derive from multiple sources.
The aim of this conference is to provide an international forum for
researchers, professionals, and industrial practitioners to share
their knowledge of how to surf this tidal wave of information.
Database management, artificial intelligence, data mining, and
knowledge representation can provide key solutions to the
challenges presented by biological data. These approaches require
powerful and sophisticated computational tools to provide efficient
solutions to very complex problems.
Exciting opportunities are emerging for integrating molecular biology components
of bioinformatics with computational, physiological,
morphological, taxonomic, and ecological components.
Addressing this challenge will facilitate the way life science
researchers access, retrieve, analyze and visualize data and
relationships in a collaborative work environment.
We invite submissions that address conceptual and practical
issues of bioinformatics.
The main objective of this conference is to bring together
researchers in all aspects of computer-based management of
biological data.
Topics of Interest
Papers are solicited on, but not limited to, the following topics:
-Bioinformatics Applications
-Computational Analysis of Biological Data
-Scientific/Biological Visualization
-Fuzzy Logic for Biological Data
-Bioinformatics Data Mining & Statistical Modeling
-Complex Data Input
-Intelligent Biological Systems
-Biological Database Migration and Integration
-Biological Data Intensive XML / Web-based Biological Data Access
-Machine Learning for Bioinformatics
-High Dimensional Biological Indexing and Similarity Search
GENERAL CHAIR
Satoru Miyano University of Tokyo, Japan
PROGRAM CHAIR
Yi-Ping Phoebe Chen
Queensland University of Technology , Australia
PROGRAM COMMITTEE
Catherine Abbott, Flinders University, Australia
Philip Bourne, Uni of California, USA
Matthew Bellgard, Murdoch Uni, Australia
Kevin Burrage, Uni of Queensland, Australia
Shi-Kuo Chang, Uni of Pittsburgh, USA
Sung-Bae Cho, Yonsei Uni, Korea
Ross Coppel, Monash Uni, Australia
David Dagan Feng, PolyTech Uni, Hong Kong
Gavin Huttley, Au. National Uni, Australia
Hasan Jamil, Mississippi State Uni, USA
Sang Yup Lee, KAIST, Korea
Minoru Kanehisa, Kyoto University, Japan
Nik Kasabov, Auckland Uni of Tech, NZ
Ashok Kolaskar, Uni of Pune, India
Tim Littlejohn, BioLateral Ltd, Australia
Ming Li, University of Waterloo, Canada
JingChu Luo, Peking University, China
Satoru Miyano, University of Tokyo, Japan
Pavel Pevzner, University of California, USA
Michael Poidinger, ANGIS, Australia
Allen Rodrigo, Uni of Auckland, NZ
Mark Ragan, Uni of Queensland, Australia
Shoba Ranganathan, APBioNet, Singapore
M. Vidyasagar, Tata Services, India
Tan Tin Wee, University of Singapore
Limsoon Wong, I2R, Singapore
Hong Yan, City Uni of HK, Hong Kong
Ueng-Cheng Yang, Nat Yang-Ming Uni, Taiwan
Ren Zhang, Wollongong Uni, Australia
Xiaofang Zhou, Uni of Queensland, Australia
ORGANIZATION CO-CHAIRS
Ian MacDonald, University of Otago, NZ
Mike Atkinson, University of Otago, NZ
IMPORTANT DATES
Submission of Abstract: Fri, 29 Aug 2003
Submission of full papers: Fri, 5 Sept 2003
Notification of acceptance: Fri, 17 Oct 2003
Camera-ready copy: Fri, 14 Nov 2003
Author registration: Fri, 14 Nov 2003
Conference: 18-22 Jan 2004
SUBMISSION GUIDELINES
APBC2004 invites high-quality original papers on any topic related to Bioinformatics.
Papers should be no more than 10 pages in length conforming to the
formatting instructions for the series Conferences in Research and
Practice in Information Technology (instructions available at the
APBC2004 website).
Each paper will be fully refereed by an international program committee.
Papers will be judged on originality, significance, correctness and clarity.
Authors should submit one (1) copy of a PDF or MS Word file to
"APBC2004 Paper Submission Website".
The title and abstract should be submitted before 29 Aug 2003, and the full
paper must be submitted by 5 Sep 2003.
The proceedings of the conference will be published through the
Australian Computer Society as the Second Asia-Pacific Bioinformatics Conference
2004 - Australian Computer Science Communications.
The best papers are being offered the opportunity to revise
and resubmit their paper for publication in the Journal of
Research and Practice in Information Technology.
To publish the paper in the conference, one of the authors needs
to register and present at the conference.
THE ADDRESS FOR ALL CORRESPONDENCE IS AS FOLLOWS:
Dr. Phoebe Chen
Program Chair of APBC2004
Faculty of Information Technology,
Queensland University of Technology
Brisbane, QLD 4001, Australia, Phone: +61 7 3864-9482 Fax: +61 7 38649390
email: p.chen@qut.edu.au
--------------------------------------------
From: "Maoql" <maoql@nlsde.buaa.edu.cn>
Sent: 2003-04-16 15:57:49.0
Subject: APLAS'03
Call for Papers
The First Asian Symposium on
Programming Languages and Systems (APLAS'03)
Beijing, China. October 27-29, 2003
Sponsored by the Asian Association for Foundations of Software (AAFS) and Beihang
University (Beijing University of Aeronautics and Astronautics)
--------------------------------------------------------------------------------
Description
APLAS aims at stimulating programming language research by providing a forum
for the presentation of recent results and the exchange of ideas and experience
in topics concerned with programming languages and systems. APLAS is based in
Asia, but intends to be an international forum that serves the worldwide programming
languages community.
The APLAS series is sponsored by the Asian Association for Foundations of Software
(AAFS), which has recently been founded by Asian researchers in cooperation
with many researchers from Europe and the USA. APLAS has been discussed and
prepared through informal workshops held in Singapore (2000), Daejeon (2001),
and Shanghai (2002). APLAS'03 will be the first formal symposium in the series.
Topics
The symposium is devoted to foundational issues in programming languages and
systems, covering the following areas:
semantics and theoretical foundations
type systems and language design
compilers and implementation
program analysis and security
program transformation and calculation
concurrency
General Chair:
Wei Li
National Laboratory of Software Development Environment
Beijing University of Aeronautics and Astronautics
Beijing, 100083, China
E-mail: liwei@nlsde.buaa.edu.cn
Program Chair:
Atsushi Ohori
School of Information Science
Japan Advanced Institute of Science and Technology
Tatsunokuchi, Ishikawa, 923-1292 JAPAN
E-mail: ohori@jaist.ac.jp
Tel: +81 761 51-1275, Fax: +81 761 51-1149
Program committee:
Manuel Chakravarty (University of New South Wales, Australia)
Wei Ngan Chin (National University of Singapore, Singapore)
Tyng-Ruey Chuang (Academia Sinica, Taiwan)
Yuxi Fu (Shanghai Jiaotong University, China)
Masahito Hasegawa (Kyoto University, Japan)
Kohei Honda (Queen Mary College, UK)
Zhenjiang Hu (University of Tokyo, Japan)
Peter Lee (Carnegie Mellon University, USA)
Shilong Ma (Beijing University of Aeronautics and Astronautics, China)
Martin Odersky (Ecole Polytechnique de Lausanne, Switzerland)
Atsushi Ohori (JAIST, Japan), Chair
Don Sannella (University of Edinburgh, UK)
Zhong Shao (Yale University, USA)
Kwangkeun Yi (KAIST, Korea)
Taiichi Yuasa (Kyoto University, Japan)
Important dates:
Submission deadline: May 27, 2003
Notification of acceptance: July 17, 2003
Final paper due: August 17, 2003
Symposium: October 27-29, 2003
Proceedings and Submission Procedure
The proceedings will be published in the Springer-Verlag Lecture Notes in Computer
Science series. Final papers will be no more than 15 pages long in the format
specified by Springer-Verlag.
Prospective authors are invited to submit full papers in English presenting
original research. Submitted papers must be unpublished and not submitted for
publication elsewhere.
Papers must be submitted in either PDF format, or as PostScript documents that are interpretable by Ghostscript. The submission page will be open at http://www.jaist.ac.jp/aplas/submit/ before the deadline. Those who have difficulty in web-based submission should contact the program chair.
It is recommended that submissions adhere to the format and length of the proceedings described above. Submissions that are clearly too long may be rejected immediately.
********************************************
[Society News]
--------------------------------------------
From:"Ben-Chang(ADSL)" <statben.shia@msa.hinet.net>
Sent: 2003-04-14 16:56:33.0
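Subject: CDMS-Newsletter No. 8 -- WWW.CDMS.org.tw
http://bbs1.nju.edu.cn/file/CDMS-Newsletter920414.doc
Chinese Data Mining Society (中華資料採礦協會, CDMS)
2003 CDMS-Newsletter No. 8: Table of Contents
A Word from the Chairman
Feature: Applications of Data Mining in Marketing Strategy, with the Telecommunications Industry as an Example
Profiles of CDMS Directors and Supervisors
Society Affairs Report
Appendix: Enrollment Guide for the "Data Mining Certification Credit Course"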
********************************************
[Discussion]
--------------------------------------------
From: raingod@sjtu.edu.cn
Sent: 2003-04-14 21:06:12.0
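Subject: On data preprocessing
Hello mlchina2,
When data have missing values, they need to be repaired during preprocessing. What repair methods are available?
If the most frequent value of an attribute is used as the fill-in value, should that frequency be counted over the whole sample
or within the subsample of each class? I would like to try this with decision trees.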
From: zlx@s1000e.cs.tsinghua.edu.cn
Sent: 2003-04-14 21:44:55.0
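Subject: Re: On data preprocessing
If I remember correctly,
the C4.5 decision tree program does not fill in missing data.
When building the tree, the information gain of an attribute is computed only from the samples that have a value for it (instances with no value for an attribute are ignored when computing that attribute's information gain).
At classification time, a sample whose value for an attribute is unknown is sent down both branches at once when it reaches a node that tests that attribute, and its class is decided at the end from the final decision weights.
For example, if the rule at a node is: if A>a1 then branch T1
else branch T2
then a sample with no value for attribute A is sent down both branch T1 and branch T2.
I do not know how well classification works if the unknown attribute values are simply filled in during preprocessing.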
From:
Sent: 2003-04-18 21:50:49.0
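Subject: Re: Re: On data preprocessing
Hello Zhang Lixin (张丽新),
1. What if, out of two hundred samples, some attributes are missing nearly 100 values? Is a decision tree still meaningful then?
2. I do not understand the example you gave of sending a sample down both branches. Shouldn't a sample with a missing value still go to branch T2?
3. I find that my understanding of how decision trees actually operate is still lacking. Could you recommend a few papers or books that would help me understand the decision tree algorithm in more detail?
Thanks :)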
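The two options raised in the original question (counting the most frequent value over the whole sample, or within each class) can be compared with a short sketch. This is only an illustration in Python, not what C4.5 does: the `MISSING` marker, the function names and the toy data are all assumptions made for the example. Note that the per-class variant can only be applied to training data, because the class of a test sample is not known in advance.

```python
from collections import Counter

MISSING = None  # assumption: missing values are stored as None


def mode(values):
    """Most frequent non-missing value, or MISSING if every value is missing."""
    counts = Counter(v for v in values if v is not MISSING)
    return counts.most_common(1)[0][0] if counts else MISSING


def impute_global(rows, col):
    """Fill column `col` with its most frequent value over the whole sample."""
    fill = mode(row[col] for row in rows)
    return [row[:col] + [fill] + row[col + 1:] if row[col] is MISSING else list(row)
            for row in rows]


def impute_by_class(rows, labels, col):
    """Fill column `col` with the most frequent value among rows of the same class.

    Falls back to the whole-sample mode when a class has no known value.
    """
    overall = mode(row[col] for row in rows)
    per_class = {}
    for c in set(labels):
        m = mode(row[col] for row, lab in zip(rows, labels) if lab == c)
        per_class[c] = overall if m is MISSING else m
    return [row[:col] + [per_class[lab]] + row[col + 1:] if row[col] is MISSING else list(row)
            for row, lab in zip(rows, labels)]


# Hypothetical toy data: one attribute, the last two rows are missing it.
rows = [["sunny"], ["sunny"], ["sunny"], ["rain"], ["rain"], [None], [None]]
labels = ["no", "no", "no", "yes", "yes", "yes", "no"]
print(impute_global(rows, 0)[5:])            # both gaps get the whole-sample mode
print(impute_by_class(rows, labels, 0)[5:])  # each gap gets the mode of its own class
```

With this toy data the two strategies disagree on the sample of class "yes", which is exactly the situation the question is about.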
--------------------------------------------
From: winds@qingdaonews.com
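Sent: 2003-04-14 21:00
Subject: Least-squares linear fitting program
Hello everyone. Does anyone know how to do a
least-squares linear fit in Matlab? That is, given a set of
vector pairs (Xi,Yi), i=1,2...M, find the matrix B and vector C
that minimize the squared error of the linear fit Y=BX+C. Could you tell me the exact
method? Is there source code available?
Thanks!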
Blue Eyes
2003.4.13
From: "Jie Bao" <baojie@cs.iastate.edu>
Sent: 2003-04-14 21:41:29.0
Subject: Re: Least-squares linear fitting program
x=[0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1];
y=[-.447 1.978 3.28 6.16 7.08 7.34 7.66 9.56 9.48 9.30 11.2];
n=1; % polynomial degree (1 = straight line)
p=polyfit(x,y,n) % p(1) is the slope, p(2) the intercept
From:"薛峰" <xuef@genomics.org.cn>
Sent: 2003-04-15 08:23:04.0
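Subject: Re: Least-squares linear fitting program
x=[1,2,3,4,5,6,7,8,9];
y=[1,2,3,4,5,6,7,8,9];
X=[ones(9,1),x']; % design matrix: a column of ones (intercept) next to x
b=regress(y',X); % needs the Statistics Toolbox; regress expects column vectors
b0=b(1), b1=b(2) % intercept and slope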
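Both replies fit a scalar y against a scalar x, whereas the question asks for a matrix B and a vector C with vector-valued Xi and Yi. One common way to handle that case is to append a constant 1 to every Xi and solve the normal equations. The sketch below does this in plain Python rather than Matlab; the names `solve` and `fit_affine` are illustrative, and it assumes the Xi are varied enough for the normal equations to be non-singular.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.

    A is n x n and B is n x k, both given as lists of rows.
    """
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: the inputs are degenerate")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fit_affine(xs, ys):
    """Least-squares fit of y ~ B x + C over pairs of vectors.

    Returns (B, C): B as a list of q rows of length p, C as a list of length q.
    """
    A = [list(x) + [1.0] for x in xs]      # append 1 so that C is absorbed
    p1, q = len(A[0]), len(ys[0])
    AtA = [[sum(a[i] * a[j] for a in A) for j in range(p1)] for i in range(p1)]
    AtY = [[sum(a[i] * y[k] for a, y in zip(A, ys)) for k in range(q)]
           for i in range(p1)]
    W = solve(AtA, AtY)                    # (p+1) x q solution of the normal equations
    B = [[W[i][k] for i in range(p1 - 1)] for k in range(q)]
    C = [W[p1 - 1][k] for k in range(q)]
    return B, C


# Hypothetical noise-free data generated from a known B and C.
xs = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
ys = [[2 * a - b + 1, 0.5 * a + 3 * b - 2] for a, b in xs]
B, C = fit_affine(xs, ys)
print(B, C)
```

In Matlab the same augmented system is usually solved with the backslash operator instead of forming the normal equations explicitly, which is numerically safer for ill-conditioned data.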
-------------------------------------------
From: zj@pact518.hit.edu.cn
Sent: 2003-04-15 14:55:44.0
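Subject: Classification of massive data
A question for everyone:
Classification problem: the data set is very large; each record (instance) has roughly 30 to 40 features, possibly fewer, and the goal is to label each record as normal or abnormal, i.e. a two-class problem. Which algorithm is suitable for classifying this much data? The model must not be too computationally complex, and near-real-time operation also has to be considered. How would a support vector machine do in this situation? Besides that, what other good classification models are there?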
zheng jun
From: "郑建华" <longzhanyuye@eyou.com>
Sent: 2003-04-15 23:00:03.0
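Subject: Re: Classification of massive data
How large is the data set: thousands, tens of thousands, hundreds of thousands, or more?
You say the classification is meant to label data as normal or abnormal. What is the ratio of normal to abnormal data? Are abnormal
cases rare?
With 30 to 40 features, could you consider doing feature extraction first?
If the data volume is large and real-time response matters, SVM does not seem very suitable, although some optimizations are worth considering,
such as SVM variants, incremental SVM learning, or Simplified Support Vector Decision Rules to reduce
decision time.
As for other classifiers, the SLIQ decision tree algorithm should have no trouble with a few million records.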
From: zj@pact518.hit.edu.cn
Sent: 2003-04-16 19:39:11.0
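Subject: Re: Classification of massive data
My data set has more than a million records and the proportion of abnormal data is small; decision trees and Simplified Support Vector Decision Rules both look like good options. For now we probably do not need feature extraction, because the features, or more precisely the attributes of each instance, are already well defined and we use them directly for classification. A dedicated feature extraction algorithm seems unnecessary, although on closer analysis the current 41 features could probably be trimmed by dropping those that are linearly correlated.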
From: "郑建华" <longzhanyuye@eyou.com>
Sent: 2003-04-17 16:10:20.0
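Subject: Re: Classification of massive data
I have no experience with SVM on data this large, but training will probably take quite long and the absolute number of SVs will probably be
large too, so the real-time decision speed is hard to predict...
I lack the experience, so let us leave it to the more experienced experts to advise.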
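The idea mentioned in this thread of trimming the 41 features by dropping linearly correlated ones can be sketched as a greedy filter on pairwise Pearson correlation. This is a minimal illustration, not the poster's procedure: the 0.95 threshold, the function names and the toy columns are assumptions, and a pairwise check does not detect a feature that is a linear combination of several others.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric columns."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / math.sqrt(va * vb)


def drop_correlated(columns, threshold=0.95):
    """Greedy filter: keep a column only if |r| < threshold against every kept column.

    Returns the indices of the columns that are kept.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Hypothetical features: columns 1 and 3 are exact linear functions of column 0.
columns = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 3, 4, 1, 2],
    [3, 5, 7, 9, 11],
]
print(drop_correlated(columns))
```

On a million-row data set the correlations could be estimated from a random sample first, since the filter only needs approximate values.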
--------------------------------------------
From:"Weiqi Wei" <wqwei@bjent.com>
Sent: 2003-04-15 19:04:35.0
Subject: ER vs EAV
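I have been busy writing my survey lately and ran into the following problem.
When I studied database system concepts I concentrated on the ER model, but in the literature I now find that much work on biomedical databases uses the EAV (entity-attribute-value) model. None of the many textbooks I checked mentions EAV, and I found no relevant literature on Google either, which is a real headache.
My own understanding is that ER and EAV are indeed different models and that EAV is closer to the relational model I learned, but my database knowledge is shallow, so I can only ask the experts :)
1. How are ER and EAV related? Who proposed each of them?
2. What are the respective strengths of EAV and ER? Why does hardly anyone mention EAV when it is so widely used in medical databases?
3. If you know of good literature on this, I would be grateful for any pointers!
Many thanks :)
Also, how exactly is "heterogeneity" defined in the literature?
Does it mean that the data are heterogeneous in kind (images, text and numeric values together), or that they come from different sources (different databases)?
I have seen both usages in the literature and am quite confused.
Looking forward to your answers!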
================================
Wei Weiqi
Institute of Basic Medical Sciences
Peking Union Medical College
Chinese Academy of Medical Sciences
Dept. Biomedical Engineering
DongDanSanTiao 5
Code:100005
Lab:86-10-65296438
==============================
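No reply to this question was archived. As a rough illustration of what "entity-attribute-value" usually refers to, the sketch below stores each fact as one (entity, attribute, value) triple and then pivots the triples back into a conventional one-column-per-attribute table. It reflects the general usage of the term rather than any particular medical database; the patient records and the names `pivot` and `to_table` are hypothetical.

```python
from collections import defaultdict

# Hypothetical EAV storage: one row per recorded fact.
eav_rows = [
    ("patient1", "age", 54),
    ("patient1", "blood_pressure", "130/85"),
    ("patient2", "age", 37),
    ("patient2", "glucose", 5.4),
]


def pivot(triples):
    """Group EAV triples into one record (a dict of attributes) per entity."""
    records = defaultdict(dict)
    for entity, attribute, value in triples:
        records[entity][attribute] = value
    return dict(records)


def to_table(records, columns):
    """Conventional rows with one column per attribute; absent values become None."""
    return [[entity] + [rec.get(c) for c in columns]
            for entity, rec in sorted(records.items())]


table = to_table(pivot(eav_rows), ["age", "blood_pressure", "glucose"])
for row in table:
    print(row)
```

The pivoted table shows why the layout suits sparse data: when most attributes are unrecorded for most entities, the conventional table is mostly empty while the triple list stores only what was actually measured.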
----------------------------------------------
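From:"baishilei" <baisl@mail.iim.ac.cn>
Subject: Data mining in deductive databases??
I work on data mining. At present most data mining still seems to focus on transaction databases (for example
the classic Apriori family of algorithms), while there is little research on mining in true relational databases. I
am considering studying mining strategies specific to relational or deductive databases. Is that feasible?
Please give me some advice.
Also, I do not know much about deductive databases (theory, research progress, etc.). Could anyone give me some guidance or
provide some material?
Thanks!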
From:maoql@nlsde.buaa.edu.cn
Sent: 2003-04-16 11:11:01.0
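Subject: Re: Data mining in deductive databases??
What is a transaction database? Why is it that all of us do our mining in relational databases?
I used to think that the kind of database you mine in has little to do with the mining algorithm itself.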
From:"Zhang Yang" <zhangy@co-think.com>
Sent: 2003-04-16 14:13:40.0
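Subject: Re: Data mining in deductive databases??
Data mining can be carried out on data in any form:
on structured data, for example relational databases (in that setting many studies call a tuple of a relational database a "transaction", which does not mean a transaction database);
on semi-structured data, for example Web mining;
on unstructured data, for example text mining, image mining and video mining.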
I hope this information is helpful.
----------------------------------------------
From:"Zhang Yang" <zhangy@co-think.com>
Sent: 2003-04-16 21:34:35.0
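Subject: Support vectors in SVM
Hello everyone,
In an SVM classifier, in the linearly separable model, the support vectors are the vectors Xi that satisfy
yi(WXi+b)=1
In the linearly non-separable model, what kind of vectors are the support vectors?
Are they the vectors Xi that satisfy
yi(WXi+b)=1-KESIi
where KESIi>=0 ?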
Thank you very much.
zhang Yang
From:"郑建华" <longzhanyuye@eyou.com>
Sent: 2003-04-17 09:34:57.0
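Subject: Re: Support vectors in SVM
There seem to be two definitions of a support vector: one is yi(WXi+b)=1, the other is ai>0. I think the two are not
equivalent in certain special cases (I believe I saw this in a paper recently but cannot remember which). That is, in some
special cases, even when
yi(WXi+b)=1, ai may still equal 0.
I think both definitions make sense. From the construction of the decision rule, WXi+b=0 with W=sum(yi*ai*xi),
only the points with ai>0 have any effect; from the viewpoint of revealing the nature of the data, all points on yi(WXi+b)=1 should arguably count.
In the linearly non-separable case, points lying between the two hyperplanes yi(WXi+b)=1 have multiplier ai=C and are also SVs, but they
are not necessarily misclassified; a point is misclassified only when epsilon>1 (epsilon being the slack written KESIi in the question).
I am not sure whether my understanding is correct; corrections from more senior members are welcome.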
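The cases discussed in this thread follow from the standard soft-margin formulation and its Karush-Kuhn-Tucker conditions, summarized below as a reference sketch. The notation is mapped from the thread as an assumption: w is W, x_i is Xi, \alpha_i is ai, and \xi_i is the slack written KESIi (or epsilon).

```latex
\begin{aligned}
&\min_{w,b,\xi}\ \tfrac12\lVert w\rVert^2 + C\sum_i \xi_i
\quad\text{s.t.}\quad y_i(w\cdot x_i+b)\ge 1-\xi_i,\ \ \xi_i\ge 0,\\
&w=\sum_i \alpha_i y_i x_i,\qquad 0\le\alpha_i\le C,\\
&\alpha_i\,\bigl[\,y_i(w\cdot x_i+b)-1+\xi_i\,\bigr]=0,\qquad (C-\alpha_i)\,\xi_i=0,\\[4pt]
&\alpha_i=0 \ \Rightarrow\ \xi_i=0,\ \ y_i(w\cdot x_i+b)\ge 1
  \quad\text{(not a support vector)},\\
&0<\alpha_i<C \ \Rightarrow\ \xi_i=0,\ \ y_i(w\cdot x_i+b)=1
  \quad\text{(exactly on the margin)},\\
&\alpha_i=C \ \Rightarrow\ y_i(w\cdot x_i+b)=1-\xi_i\le 1
  \quad\text{(misclassified only if } \xi_i>1\text{)}.
\end{aligned}
```

Read this way, every point with \alpha_i>0 satisfies y_i(w\cdot x_i+b)=1-\xi_i with \xi_i\ge 0, which agrees with the form proposed in the question, while the first case shows that a point can lie on the margin with \alpha_i=0, which is the special case mentioned in the reply.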
----------------------------------------------
From:"陈风" <chaidaxia@163.com>
Sent: 2003-04-17 20:08:46.0
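Subject: A question about decision trees
Dear teachers: Decision trees are currently a fairly popular family of classification algorithms, but when the training set is too small an ID3 decision tree often ends up with incomplete rules, that is, a test sample cannot be followed all the way along any path. How should this be improved? Fuzzy decision trees seem able to handle the problem. If anyone knows, please advise!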
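One simple workaround for the incomplete-rules problem, separate from the fuzzy decision trees mentioned in the question, is to record at every internal node the majority class of the training samples that reached it and to answer with that class whenever a test sample carries an attribute value for which the node has no branch. The sketch below assumes a tree stored as nested dictionaries with a hypothetical "default" field filled in during training; the tree itself is hand-written for illustration.

```python
def classify(node, sample):
    """Walk an ID3-style tree; fall back to the node's majority class
    when the sample's attribute value has no matching branch."""
    while isinstance(node, dict):
        value = sample.get(node["attr"])
        if value not in node["children"]:
            return node["default"]  # unseen or missing value: stop here
        node = node["children"][value]
    return node  # a leaf is stored directly as a class label


# Hypothetical tree; "default" is assumed to be the majority class of the
# training samples that reached each node.
tree = {
    "attr": "outlook",
    "default": "yes",
    "children": {
        "sunny": {
            "attr": "humidity",
            "default": "no",
            "children": {"high": "no", "normal": "yes"},
        },
        "rain": "yes",
    },
}

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # complete path
print(classify(tree, {"outlook": "overcast"}))                     # no branch at the root
print(classify(tree, {"outlook": "sunny", "humidity": "medium"}))  # no branch lower down
```

The fallback guarantees that every test sample receives some class, at the cost of ignoring whatever attributes would have been tested further down the tree.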
----------------------------------------------
From:姜 伟<crazymaths@yahoo.com.cn>
Sent: 2003-04-17 21:01:56.0
Subject: Questions about Bayesian networks
From:"jia" <jia@email.jlu.edu.cn>
Sent: 2003-04-18 12:20:59.0
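Subject: Re: Questions about Bayesian networks
I am learning this too; let us discuss it together.
There is a paper:
Heckerman D, Wellman Michael. Bayesian Networks. Communications of the ACM,
1995, 38(3): 27-30
which is well written, but unfortunately I only have this part of it. Does anyone have the full text?
WWW.AUAI.ORG is also a good website.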
From:"丁弋" <dequatorem@eyou.com>
Sent: 2003-04-18 19:48:46.0
Subject: Bayesian Networks
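Nilsson's book "Artificial Intelligence" also has a partial introduction to Bayesian Networks.
I hope colleagues who work on Bayesian Networks will exchange ideas more in future!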
http://www.cs.berkeley.edu/~russell/ai.html#uncertainty
From:姜 伟<crazymaths@yahoo.com.cn>
Sent: 2003-04-20 19:58:47.0
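Subject: Re: Questions about Bayesian networks
I also have a fairly good paper, though there are parts of it I still do not understand; I hope everyone can offer guidance and discussion. See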
----------------------------------------------
From:"Zhang Yang" <zhangy@co-think.com>
Sent: 2003-04-19 05:24:11.0
Subject: Corpora for Chinese text classification
----------------------------------------------
From:姜 伟<crazymaths@yahoo.com.cn>
Sent: 2003-04-23 09:07:51.0
Subject: Tools for Bayesian networks?
I hear there are many Bayesian network tools available; could the experts recommend one or two?
From:"Jie Bao" <baojie@cs.iastate.edu>
Sent: 2003-04-24 05:49:18.0
Subject: Re: Tools for Bayesian networks?
BNJ, Bayesian Network tools in Java, in KSU, include Influence Diagram
function, http://bndev.sourceforge.net/ ( with API)
Bayes Net Toolbox for Matlab, Kevin Murphy, supports influence diagrams
as well as Bayes nets.
http://www.ai.mit.edu/~murphyk/Software/BNT/bnt.html or
http://groups.yahoo.com/group/BayesNetToolbox ( with API)
Netica, for both Bayesian belief network and influence diagram,
http://www.norsys.com/
HUGIN graphic system for developing and manipulating Bayesian networks
and influence diagrams. The one limitation is that it does not accept
networks with more than 200 states http://www.hugin.dk ( with API)
GeNIe (Graphical Network Interface), building and reasoning about BN's
and influence diagrams in Win 95/NT environments,
http://www2.sis.pitt.edu/~genie
Microsoft Bayes Networks (MSBN) , Restricted to single decision
influence diagram. http://www.research.microsoft.com/adapt/MSBNx/ (
with API)
XBAIES , http://www.staff.city.ac.uk/~rgc/webpages/xbpage.html
JavaBayes, http://www-2.cs.cmu.edu/~javabayes/Home/ ( with API)
See Murphy's list:
http://www.ai.mit.edu/~murphyk/Software/BNT/bnsoft.html
----------------------------------------------
From:公子 璧<gongzibi@yahoo.com.cn>
Sent: 2003-04-23 19:31:54.0
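Subject: Question: the Coulomb model
I remember seeing a description and applications of the Coulomb model somewhere, but I cannot recall where. Does anyone know what the Coulomb model is applied to and how exactly it is described?
Many thanks!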
From:"vincent" <vincent@Comp.HKBU.Edu.HK>
Sent: 2003-04-24 20:37:06.0
Subject: Re: Question: the Coulomb model
I have recently been reading a paper that explains SVM with a Coulomb model; it is quite novel:
Coulomb Classifiers: Generalizing Support Vector Machines via an
Analogy to Electrostatic Systems
http://citeseer.nj.nec.com/528957.html
Sent: 2003-04-26 11:50:56.0
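Subject: Re: Question: the Coulomb model
As I see it, a Coulomb model means using an electrostatic system to explain or simulate the problem to be solved, usually
involving Coulomb's law. Searching for "coulomb energy" or "electrostatic system" should
return many results.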
From:zhouzh@nju.edu.cn
Sent: 2003-04-26 13:59:13.0
Subject: Re: Question: the Coulomb model
The Coulomb Potential Model is one realization of Field Theory, also called electric field theory.
----------------------------------------------
From:raingod@sjtu.edu.cn
Sent: 2003-04-28 14:07:39.0
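Subject: How are *.data, *.test and *.names files generated?
I see that some data mining data sets on the web are stored in these formats. Does anyone know how they are generated?
Or which book could I consult for a detailed understanding of the formats? Thanks.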
From:"Jie Bao" <baojie@cs.iastate.edu>
Sent: 2003-04-24 20:37:06.0
Subject: Re: How are *.data, *.test and *.names files generated?
http://www.rulequest.com/see5-unix.html#.names
More information is available via the UC-Irvine archive of machine learning
datasets http://www.ics.uci.edu/~mlearn/MLRepository.html
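Such files are plain text and can be written by hand or by a short script. The sketch below generates the text of a .names and a .data file in the classic C4.5 layout as it is commonly described: the .names file lists the classes on its first line and then one "attribute: values." entry per attribute, and the .data file has one case per line with the attribute values in the same order followed by the class, using "?" for an unknown value. A .test file uses the same layout as .data. Treat the details as an assumption and check them against the page linked above, which documents the closely related See5 layout; the function names and the toy data are hypothetical.

```python
def names_text(classes, attributes):
    """Text of a C4.5-style .names file.

    `attributes` is a list of (name, spec) pairs, where spec is either the
    string "continuous" or a list of the attribute's discrete values.
    """
    lines = [", ".join(classes) + "."]
    for name, spec in attributes:
        values = spec if isinstance(spec, str) else ", ".join(spec)
        lines.append(f"{name}: {values}.")
    return "\n".join(lines) + "\n"


def data_text(rows):
    """Text of a .data or .test file: one case per line, class last,
    with "?" standing for an unknown value (stored here as None)."""
    return "".join(
        ", ".join("?" if v is None else str(v) for v in row) + "\n"
        for row in rows
    )


# Hypothetical two-attribute data set.
names = names_text(
    ["yes", "no"],
    [("outlook", ["sunny", "rain"]), ("temperature", "continuous")],
)
data = data_text([["sunny", 30, "no"], [None, 18.5, "yes"]])
print(names)
print(data)
```

Writing the two strings to files that share a stem, for example a hypothetical weather.names and weather.data, gives the pair of files that such programs expect.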
*******************************************************************************
********************* End of Vol. 3, No. 5, May 2003 ****************************
*******************************************************************************
* <MLChina mailing list> If you no longer wish to receive mail from this list, please unsubscribe at the address below *
http://www.pami.sjtu.edu.cn/people/gzli/resource.htm#mlchina *********
*******************************************************************************