Sunday, 18 December 2011

The Basics of Research Methods

Summary: I think every serious researcher should first read a basic philosophy book to understand the meaning of their work (Philosophy of science: a very short introduction). In practice, research begins with searching and reviewing the literature (Doing a literature search; Critical thinking for students: learn the skills of critical assessment and effective argument; Doing a literature review or The literature review: six steps to success). Based on the past and current literature, one can form one's own research project or proposal (Developing effective research proposals). Following the research project, which sets out the research problem and the data types, one can begin data collection, using either secondary data (Secondary research information sources and methods) or primary data (survey, questionnaire, interview and participation). Whatever the source, the type of data determines the analytic tool used, either qualitative (Qualitative data analysis: a user-friendly guide for social scientists; Qualitative data analysis: an expanded sourcebook, 2nd edition) or quantitative (Interpreting quantitative data; Statistics for people who hate statistics; Doing statistics with SPSS; Nonparametric statistics for non-statisticians: a step-by-step approach). The results and the general theoretical conclusions are then presented (Writing and presenting research).

Research aims to generate new knowledge and to convince readers of the validity of its conclusions.
Its tasks: categorise objects, describe facts, explain phenomena, compare contrasting situations, correlate two or more phenomena, predict and control.

Different approaches
historical aims to re-establish and re-evaluate past facts; descriptive is based on observation as a means of collecting data; correlational looks for relations between an independent variable and a dependent variable; comparative explores what conditions are necessary to cause certain events; experimental attempts to isolate and control the determining conditions; simulation works by creating models; action research depends mainly on observation and behavioural data (pragmatic purpose); ethnological focuses on how subjects interpret their own behaviour rather than imposing a theory from outside; cultural is concerned with language and cultural interpretation.

Research philosophy
Metaphysics: idealism and materialism
Epistemology: empiricism (inductive reasoning) and rationalism (deductive reasoning)
Scientific method: identifying a problem - developing a hypothesis inductively from observations - charting its implications by deduction - practical or theoretical testing - rejecting or refining it.

The role of human subjects and of the researcher, and the status of social phenomena: positivism (universal rules) and relativism (different interpretations of the world). Postmodernism considers science itself as subject to continual reinvention and change; critical realism suggests that concepts and theories about social events are developed on the basis of their observable effects, and interpreted in such a way that they can be understood and acted upon, even if the interpretation is open to revision as understanding grows.

1. Philosophy of science: a very short introduction 

Writing proposal and research project
As we can see, every kind of research begins with writing up a research project. A research proposal needs to explain the nature of the research and its context, and why it is needed; the aim and objectives of the research; how it will be carried out; and what the outcomes are likely to be.

Title-Aim-Background and literature review-research problem-methods-expected outcomes-timetable and description of resources required-references

1. Good essay writing: a social sciences guide
2. Essay writing: a student's guide
3. Proposals that work: a guide for planning dissertations and grant proposals
4. Developing effective research proposals
5. Writing and presenting research
6. Writing up qualitative research

Finding research problem

A research problem can be set up as a main question and sub-questions, which focus on different aspects, different perspectives, different concepts and different scales.

The research problem can also be stated in an exploratory approach, which constitutes the subject and scope of the exploration.

The hypothetico-deductive method is expressed in terms of the testing of a particular hypothesis. A hypothesis needs to be testable, limit the enquiry to the interaction of certain factors, and suggest appropriate methods for collecting and analysing the data. For this reason, the main abstract, conceptual hypothesis is usually broken down into sub-hypotheses in order to be operational.

A set of propositions, rather than a hypothesis, allows one to concentrate on particular relationships between events. The first proposition is a statement of a particular situation, which is then followed by further propositions that point out factors or events related to it.

An argument draws conclusions from premises. Generally there are only two aims of argumentation: one is to argue for a statement, the other is to refute it. To support a statement, you must prove the truth of the premises, then employ a sound logical progression. To refute a statement, you can challenge the truth of the evidence, question the relevance or completeness of the evidence, challenge the logic of the argument, or produce counter-examples.

1. Critical thinking for students: learn the skills of critical assessment and effective argument.

Literature Review (15-20 well selected references)
Four major directions:
Research theory and philosophy
History of developments in your subject
Latest research and developments in your subject
Research methods

Critical appraisal:
Relevance: data collection, analysis methods, findings and conclusions
Theoretical assumptions
Logic of the argument
Comparison between texts

Contents of a review:
Study design and assumptions
Methods and data collection
Analytical methods
Main finding
Conclusions
The study's strengths and limitations

1. Doing a literature search: a comprehensive guide for the social sciences
2. Internet research skills: how to do your literature search and find research information online
3. The literature review: six steps to success

Data nature
Theory (main question)
Concepts (sub-questions)
Indicators (data types)
Variables (data measures)
Values (measurements)

Primary data: measurement, observation, interrogation, participation
Secondary data: check the quality (source, argument, comparison)

Quantitative data use statistical means as the major analytical tool, while qualitative data require different analytical techniques (maybe critical discourse analysis?)

Different levels of data measurements
Nominal level: distinctive categories (simple graphic and statistical techniques)
Ordinal level: order with regard to a particular property (more statistical techniques)
Interval level: regular scale of some sort
Ratio level: a true zero, so values can be expressed as multiples or fractional parts of one another

Secondary data collection
*Locating and accessing secondary data: data sets online; documentary data; libraries and archives; commercial and professional bodies
*Quality check: author and source credibility; methodology and argument; relevance
*Analysis tools (the book clearly prioritizes statistical techniques over descriptive means)
-Content analysis (coding and measuring)
-Data mining
-Meta-analysis

1. Reworking qualitative data: the possibility of secondary analysis
2. Secondary research information sources and methods

Primary data collection
Sampling and case studies (for studies of inter-relational factors)
Population characteristics and sampling techniques

Collection methods
Survey by questionnaires; interviews; observation; participation (grounded theory uses the collected data to develop theory rather than to test or refine an existing one); experiments; model building.

1. How to sample in surveys
2. Developing a questionnaire-real world research
3. Interviews in qualitative research
4. Using observations in small-scale research: a beginner's guide

Quantitative data analysis 
Statistical analysis is often only meaningful when data on a number of cases are available.

Parametric statistics and non-parametric statistics (nominal or ordinal data)

Parametric tests: descriptive and inferential
Parametric tests: univariate analysis (descriptive); bivariate and multivariate analysis (inferential)

1. Statistics for people who hate statistics
2. Doing statistics with SPSS
3. Starting statistics: a short, clear guide
4. Interpreting quantitative data
5. Quantitative data analysis for SPSS 12 and 13: a guide for social scientists
6. Nonparametric statistics: an introduction

Qualitative data analysis
In qualitative research a reciprocal process of data collection and data analysis is an essential part of the project. This type of research is based on data expressed in the form of words rather than numbers. You will be acting rather like a lawyer presenting a case, using a quasi-judicial approach such as that used in an enquiry into a disaster or scandal.


Qualitative data analysis: data reduction, data display, conclusion and verification.
Data reduction: typologies (classification and coding system); patterns and themes (pattern coding); interim summary.
Data display: matrix (tables); network (maps and charts).

Texts, documents and discourse analysis: interrogative insertion; problem-solution discourse; membership categorization; rhetorical analysis; narrative analysis; semiotics; discourse analysis (the interpretive context of the discourse and the rhetorical organization of the discourse).

1. Real world research: a resource for social scientists and practitioner-researchers.
2. Social research methods
3. An introduction to qualitative research
4. Interpreting qualitative data: methods for analysing talk, text and interaction
5. Doing and writing qualitative research
6. Making sense of qualitative data: complementary research strategies

Thursday, 17 November 2011

How to Write & Publish a Scientific Paper

Although the author is a medical scientist and his advice is basically for publishing natural science papers, there are still common points for the writing of any scientific paper. His major point is sound and valid: the standard of a science is the possibility of reproducing the results. If natural science depends on a clear description of the experiments, social science counts on the description of the methods used and the data sources.

Title: it is used for indexing in databases, so delete all unnecessary words such as THE, OF, A, etc.

Abstract: scope of research, methods and materials, results, conclusion, all in one paragraph

Introduction: an extended version of the abstract. Attract the reader's attention; include a literature review if necessary.

Materials and methods: I need to check how to write this section for social science. For the moment it concerns the data sources, the description of second-hand information, and the conduct of field work.

Result: descriptive analysis of the data collected.

Discussion and conclusion: analytical discussion, generalization, new contribution to existing theories, limits and future directions.

A more interesting part of the book is understanding certain writing protocols from the point of view of a journal editor: REDUCE PRINT COST. Footnotes are gathered at the end of the article; tables and graphs are inserted only when unavoidable and are designed to fit the space.

The review and publishing process: Submission-Editor-Peer review-Editor decision (reject, modification, accept)-Modification-Managing editor-Proof read-Publication.

The suggestions on language and presentation are already well known.

Wednesday, 25 May 2011

Sell yourself

be yourself and win the neutral audience

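knowledge, facial expression, body language, voice

1. simple phrases and oral language, making it interesting
2. Raise the eyebrows slightly so that a few parallel lines form on the forehead. Look at the nose, mouth, etc.
3. Open gestures, feet apart at shoulder width. When seated, keep the spine straight and away from the chair back, heels on the floor.
4. Vary volume, pitch and speed. Only speak while looking at the audience; pause without meaningless filler words.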

control breath and speak while thinking

PRACTICE!!!

Monday, 28 February 2011

Statistics 4: Hypothesis test between groups (categorical variable and numerical variable)

I. INDEPENDENT SAMPLE T TEST

The independent-samples t test takes as its null hypothesis that the distribution (mean) of one (or more) variables is the same in two groups. (Unstated assumption: group membership is the independent variable.)

Condition: two independent samples take one test only.


t(58)=-1.14, p>0.05

1. df (degree of freedom)=n1+n2-2

2. p value

3. one-tailed or two-tailed test

Effect size (ES) is a measure of the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity. An effect size calculated from data is a descriptive statistic that conveys the estimated magnitude of a relationship without making any statement about whether the apparent relationship in the data reflects a true relationship in the population. In that way, effect sizes complement inferential statistics such as p-values. In a certain sense, the p-value tells you how likely data at least as extreme as yours would be if the null hypothesis were true (and so how risky it is to reject it), while the effect size or correlation coefficient tells you the strength of the relationship between the variables.
Cohen's d: (X1-X2)/SD, where X1 and X2 are the two group means and SD is the pooled standard deviation
http://www.uccs.edu/~faculty/lbecker/
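A minimal Python sketch of the independent-samples t test and Cohen's d described above, assuming equal variances in the two groups (the pooled-variance form). The function name and the scores are hypothetical, invented only for illustration; they are not SPSS output.

```python
import math
from statistics import mean, variance


def independent_t(group1, group2):
    """Return (t, df, cohen_d) for two independent samples, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    diff = mean(group1) - mean(group2)
    t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    cohen_d = diff / math.sqrt(pooled_var)
    return t, df, cohen_d


# Hypothetical scores for two independent groups.
group_a = [2, 4, 6, 8, 10]
group_b = [1, 3, 5, 7, 9]
t, df, d = independent_t(group_a, group_b)
print(f"t({df}) = {t:.2f}, Cohen's d = {d:.2f}")
```

The t value still has to be compared with a critical value (or converted to a p value) for the chosen one-tailed or two-tailed test; that step is left out here.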

II. ANOVA (SIMPLE ANALYSIS OF VARIANCE / ONE WAY ANALYSIS OF VARIANCE)

One-way analysis of variance takes as its null hypothesis that the distribution (mean) of one (or more) variables is the same across more than two groups.

condition: more than two groups take one test (one variable).


F(2,27)=8.80, p<0.05

1. df (between)=k-1; df (within)=N-k; df(total)=N-1

2. p value

3. only two tailed test

To determine where the difference lies, run a post-hoc test with the Bonferroni correction.
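The F ratio and its degrees of freedom can be sketched in a few lines of Python. This is only an illustration under the usual one-way ANOVA definitions (between-group and within-group sums of squares); the function names and the data are hypothetical. The Bonferroni helper simply divides alpha by the number of pairwise comparisons.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


def bonferroni_alpha(alpha, k):
    """Per-comparison alpha when all pairs among k groups are compared."""
    comparisons = k * (k - 1) // 2
    return alpha / comparisons


# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(f"F({df_b},{df_w}) = {f_ratio:.2f}")
print("alpha per post-hoc comparison:", bonferroni_alpha(0.05, len(groups)))
```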

III. FACTORIAL ANALYSIS OF VARIANCE

Factorial analysis of variance takes as its null hypothesis that the distribution (mean) of one (or more) variables is the same across groups categorized by two variables.

Main effect and interaction effect

SPSS uses univariate analysis of variance, for it concerns only one explained variable, or dependent variable.

*if more than two variables, Holy Grail Analysis of Variance

IV. PAIRED SAMPLE T TEST

The paired-sample t test takes as its null hypothesis that the distribution (mean) of one (or more) variables in one group is the same before and after an experiment. (Unstated assumption: the experiment is the independent variable.)

Condition: only one group takes two tests.


t(24)=2.45, p<0.05

1. df=n-1

2. p value

3. one tailed or two tailed test.
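A minimal sketch of the paired-sample t statistic in Python: the test works on the differences between the two measurements of each case, with df = n-1. The function name and the before/after scores are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for one group measured twice."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores of the same three people before and after an experiment.
before = [5, 5, 5]
after = [6, 7, 8]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}")
```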

 

Thursday, 24 February 2011

Statistics 2: Inferential statistics concepts

INFERENTIAL STATISTICS

Inferential statistics is used to infer the information about population from the observation of a sample.

Population-Sampling frame-Sampling pool (probability/random sampling or non-probability sampling)-Sample

Estimation

Estimation of numerical data

The principle is that if we could draw all possible samples from a population, the distribution of the sample means would follow a normal curve centred on the population mean. However, we observe only one sample, and we don't know where exactly that sample is located in the sampling distribution; in other words, we don't know whether we have a typical, representative sample or not. We therefore estimate the standard deviation of the population from the sample, dividing the sum of squared deviations by (n-1) rather than n before taking the square root. That is why the variance is usually calculated with (n-1), not n. We then obtain the standard deviation of the sampling distribution of means (the standard error) by dividing the estimated population sd by the square root of the sample size. From the properties of the normal distribution, we estimate a range for the population mean around the sample mean at a given confidence level: roughly (sample mean - 2 standard errors; sample mean + 2 standard errors) at the 95% confidence level.

Student's t adjusts the confidence interval when the sample size varies. Roughly, when the sample size is larger than 30, the t distribution is close to the normal distribution.
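The estimation steps can be sketched in Python as follows. This assumes a sample large enough (more than 30 cases) for the normal value z = 1.96 to be used at the 95% level; for small samples the corresponding Student t value would replace it. The function name and the sample are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_confidence_interval(sample, z=1.96):
    """Return (low, high) for the population mean, using a normal approximation."""
    n = len(sample)
    # stdev() divides by (n-1), giving the estimate of the population sd.
    standard_error = stdev(sample) / math.sqrt(n)
    centre = mean(sample)
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical sample of 36 observations (the integers 1 to 36).
sample = list(range(1, 37))
low, high = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: ({low:.2f}, {high:.2f})")
```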

Estimation of categorical data

The principle is the same, except that here we deal with category percentages. The standard deviation of the sampling distribution of a percentage is calculated as:

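standard error = square root of (P × (100 - P) / n), where P is the category percentage in the sample and n is the sample size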

Another difference is that the precision of the estimate is influenced not only by the sample size but also by the category percentage in the population. As there is no Student t table for categorical data, we have another benchmark: the smaller category percentage of the sample × sample size should be greater than 1000 (equivalently, n × p > 10 with p written as a proportion); if not, increase the sample size.
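A small Python sketch of the estimation for categorical data, using the standard formula for the standard error of a proportion, sqrt(p × (1 - p) / n), with p written as a proportion rather than a percentage. The benchmark above then becomes "smaller proportion × n > 10". The function names and the figures are hypothetical.

```python
import math


def proportion_confidence_interval(p, n, z=1.96):
    """Return (low, high) for a population proportion, normal approximation."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error


def sample_large_enough(p, n):
    """Benchmark from the notes: smaller category share times n must exceed 10."""
    return min(p, 1 - p) * n > 10


# Hypothetical survey: 20% of 400 respondents fall in the category of interest.
low, high = proportion_confidence_interval(0.2, 400)
print(f"95% confidence interval: ({low:.4f}, {high:.4f})")
print("sample large enough:", sample_large_enough(0.2, 400))
```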

Hypothesis testing

A good hypothesis should:

1) reflect the theoretical background and available references

2) be a short and clear declarative sentence

3) state a relationship between variables

4) be testable

Inferential statistics basic concepts:
1. Null hypothesis vs. research hypothesis (nondirectional and two tailed test; directional and one tailed test)

The null hypothesis means that in the whole population there is no difference between the two groups, or in other words that the independent variable doesn't cause a different distribution of the dependent variable. The observed difference is due to chance.

2. Normal Curve

Mean=Median; 34.13% of the data is between mean and mean+1 standard deviation (or -1); 13.59% between +1 and +2 sd; 2.15% between +2 and +3 sd, 0.13% more than +3 sd. (Probability: < +1 84%; (+1,+2) 14%; (+2, +3) 2.15%)

3. Standard score (Z)

Measures the distance between a value x and the mean, in units of the standard deviation: Z = (x - mean)/sd. From it we can infer the probability of observing a value like x.

4. Significance rate: 5%
The Z score helps to decide whether an event is due purely to chance or reflects a real effect. If an event would hardly ever happen by chance, that is, with a probability below 5% (Z > 1.65 for a one-tailed test, |Z| > 1.96 for a two-tailed test), then the null hypothesis is rejected.

5. Significance level (p)

6. Degree of freedom (df)
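The links between the normal curve, the Z score and the 5% significance rate can be checked with a short Python sketch. It uses the error function from the standard library to get areas under the standard normal curve; the function names and the example score are hypothetical.

```python
import math


def z_score(x, mean, sd):
    """Distance of x from the mean, in units of the standard deviation."""
    return (x - mean) / sd


def normal_cdf(z):
    """Probability that a standard normal value falls below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def two_tailed_p(z):
    """Probability of a value at least as far from the mean as z, in either direction."""
    return 2 * (1 - normal_cdf(abs(z)))


# Hypothetical example: a score of 130 on a scale with mean 100 and sd 15.
z = z_score(130, 100, 15)
p = two_tailed_p(z)
reject_null = p < 0.05

# Areas under the curve quoted in the notes on the normal curve.
area_0_to_1 = normal_cdf(1) - normal_cdf(0)  # about 0.3413
area_1_to_2 = normal_cdf(2) - normal_cdf(1)  # about 0.1359
print(f"z = {z:.2f}, two-tailed p = {p:.4f}, reject null: {reject_null}")
```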


Type I error (α): rejecting the null hypothesis when it is true. Type II error (β): failing to reject the null hypothesis when it is false.
The null hypothesis is about the whole population and cannot be verified directly.

Type II error is related to the sample size

I think the most valuable part of today's chapter is about the design of inferential statistics. Statistical significance is meaningless by itself. The most important task is the analytic work that generates the hypothesis about the variables. In practice, constructing the sample is also more important than the verification itself.

How to control the variables and how to choose the appropriate instruments is what I'll continue to learn. But I don't think it will be more difficult or more complicated than the original academic analysis, especially with the help of computer software.

P.108

Monday, 21 February 2011

Statistics 1: Descriptive Statistics

DESCRIPTIVE STATISTICS

Statistics is the main quantitative tool used by all social scientists, for this reason I make this series of posts to record my self study outcomes.

Revision of general concepts:

Type of data: Categorical variable (nominal, ordinal); Numerical variable (interval). ARRAY

Mode and Variation ratio for nominal variables; Median and Midspread for ordinal variables; Mean and SD or Median and midspread for interval variables.

Percentage, proportion, ratio and rates of occurrence

Importance of the Z score: for different types of data, the main idea should be to keep as much as possible of the information that the original data contain. Thus, for example, when we compare the progress China has achieved in GDP and in human development, we should avoid reducing the numerical data to ordinal data, and instead use standardized Z scores to compare the historical development in these two areas.

Index construction: Index-Dimensions-Indicators-Variables-Values. To combine values measured in different units, we add up their respective rank standings, either as a simple sum or as a weighted sum. The index scores then give the index rankings.

Central tendency- Mean(arithmetic mean), Weighted mean, Median, Mode
Variability- Standard deviation, Variance, Range, Midspread (interquartile range: upper quartile-lower quartile); Variation ratio

Skewness and Kurtosis: positive skewness when the mean is larger than the median, negative when the mean is smaller than the median. SK=3(mean-median)/sd
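A quick Python check of the formula SK=3(mean-median)/sd: the sign of SK follows the sign of (mean - median), so a few large values that pull the mean above the median give a positive result. The function name and the data are hypothetical.

```python
from statistics import mean, median, stdev


def pearson_skewness(data):
    """Skewness coefficient SK = 3 * (mean - median) / sd."""
    return 3 * (mean(data) - median(data)) / stdev(data)


# Hypothetical data with one large value pulling the mean above the median.
right_skewed = [1, 2, 3, 4, 10]
# The mirror image: one small value pulling the mean below the median.
left_skewed = [-10, -4, -3, -2, -1]

print("right-skewed SK:", round(pearson_skewness(right_skewed), 3))
print("left-skewed SK:", round(pearson_skewness(left_skewed), 3))
```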


CORRELATION

Direction (positive or negative); Nature (Forms of line); Strength (How well to predict a dependent variable from knowing independent variable)

Correlation coefficient = (Original error - Remaining error)/Original error

Correlation between categorical variables.

Cross tabulation with column percentage while independent variables in column; Mosaic plot graph.

Lambda correlation coefficient: ((Total - Total mode) - ((Category 1 total - Category 1 mode) + (Category 2 total - Category 2 mode))) / (Total - Total mode).

When the category modes are all in the same row, the lambda correlation coefficient is always zero.
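The lambda formula can be sketched in Python for a cross tabulation given as a list of columns, one per category of the independent variable, each holding the counts for the rows of the dependent variable. The function name and the counts are hypothetical.

```python
def lambda_coefficient(columns):
    """Lambda = (original error - remaining error) / original error."""
    # Row totals across all columns give the overall distribution of the DV.
    row_totals = [sum(cells) for cells in zip(*columns)]
    total = sum(row_totals)
    # Errors made when always guessing the overall mode.
    original_error = total - max(row_totals)
    # Errors made when guessing the mode within each category of the IV.
    remaining_error = sum(sum(col) - max(col) for col in columns)
    return (original_error - remaining_error) / original_error


# Hypothetical counts: two IV categories (columns), two DV categories (rows).
modes_in_different_rows = [[30, 10], [10, 30]]
modes_in_same_row = [[30, 10], [25, 15]]

print(lambda_coefficient(modes_in_different_rows))
print(lambda_coefficient(modes_in_same_row))
```

The second table illustrates the remark above: when every column has its mode in the same row, knowing the category does not change the guess, so lambda is zero.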

Correlation between numerical variables.

Scatter graph with best-fit line (regression line)

Y=aX+b, which must pass through the point (X,Y) where X=mean of IV and Y=mean of DV.

Coefficient of determination = (Variance of DV-Variance of errors)/(Variance of DV). Error is the difference between observed value and predicted value.
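A minimal Python sketch of the best-fit line and the coefficient of determination, assuming an ordinary least-squares fit. X stands for the independent variable and Y for the dependent variable; the function names and the data points are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return slope a and intercept b of the least-squares line Y = aX + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sxy / sxx
    # Choosing b this way forces the line through the point of means.
    b = y_bar - a * x_bar
    return a, b


def coefficient_of_determination(xs, ys):
    """(Variation of DV - variation of errors) / variation of DV."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_error = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return (ss_total - ss_error) / ss_total


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
r_squared = coefficient_of_determination(xs, ys)
print(f"Y = {a:.2f}X + {b:.2f}, coefficient of determination = {r_squared:.2f}")
```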

Correlation between categorical and numerical variables.

eta squared coefficient = (Variance of DV - ((Variance of category 1) + (Variance of category 2))/2) / (Variance of DV)

The coefficient of determination is the square of the correlation coefficient (Pearson r) and shows the percentage of the variance of one variable that can be explained by the variance of the other. When X and Y are correlated, the variance of both may be partly explained by a common factor Z, so there is no implied causality between X and Y.
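A sketch of eta squared in Python, written with sums of squares (total variation minus within-category variation, over total variation). With categories of equal size this gives the same result as averaging the category variances as in the formula above; that equivalence is an assumption of this sketch, not something stated in the book. The function name and the data are hypothetical.

```python
from statistics import mean


def eta_squared(groups):
    """Share of the DV's variation explained by the categorical variable."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Variation left over inside each category.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_total - ss_within) / ss_total


# Hypothetical values of a numerical DV in two categories.
category_1 = [1, 2, 3]
category_2 = [5, 6, 7]
print(round(eta_squared([category_1, category_2]), 3))
```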

Choose the proper coefficient according to type of data set.
Pearson correlation score for numerical data
Chi-square for categorical data
Spearman rank correlation coefficient for ordinal variables
Point-biserial correlation coefficient for a dichotomous categorical variable and a numerical variable
Rank-biserial correlation coefficient for a dichotomous categorical variable and an ordinal variable

In my paper, cross tabulation with column percentage could be used to describe the difference of distribution of projects between public and private enterprises, while scatter graph could be used to show the impacts of GDP, Natural resources and Institutions on total number of projects. 


Research methods of Chinese economy

As an economist trained in Europe, I keep wondering what the appropriate research methods for the Chinese economy are. I'm not totally against the econometric modelling of conventional economic studies, but I have recently come to question the logical basis of this kind of research. Where do all the hypotheses guiding the mathematical effort come from? Yes, all our current work is based on preceding research. But what if the preceding works are a chain of deductive reasoning from a sort of "axiom", such as interest maximization, which is itself questionable? If the theoretical base is in doubt, where does that leave the empirical studies? Especially when there are so many statistical traps and data sources are unreliable in a transitional economy like China. In the end, economics is a branch of the social sciences that examines human behaviour and interaction. Should we practise it as a natural science? The simplicity and beauty of mathematical models is attractive, but it creates an illusory certainty for consumers living in an uncertain world. One can argue that simplification through strict assumptions is methodologically necessary, but what is left unsaid is the hidden prior assumptions treated as non-debatable. Even if such a prior assumption, like self-interest maximization, can explain 80% of reality, as mainstream economists claim, the unexplained 20% may conceal more important facts. Thus, I have to agree with my supervisor about the proper order of reasoning in social science, including economics: you first observe the social facts, describe them as faithfully as possible, then analyze them to generate hypotheses, which need to be verified by quantitative methods. As you can see, a qualitative study may precede a quantitative study, for the latter is just one means of checking the former, not in itself the goal of scientific research.
Where I disagree with my supervisor is on when the reading of pertinent works should come in. The searching and reading of preceding works (specific or theoretical), in my eyes, may happen at any time, either at the beginning of a project or at the end of it, provided that it doesn't manipulate or mislead what you really observe in the field. Thus, I should perhaps describe myself as a micro-economist, who favours inductive reasoning from the micro level and who considers the macroeconomy as an agglomeration of micro fibres.



Monday, 7 February 2011

Welcome

Welcome everyone to my first blog on the Chinese economy. It's basically the place where I'll store and comment on the latest news and research articles, but it's also a platform to exchange ideas with all who are interested in the emerging China.