Home Backend Development C#.Net Tutorial Three ways to read Excel in C# and comparative analysis

Three ways to read Excel in C# and comparative analysis

Jan 13, 2017 pm 05:10 PM

(1) OleDB method

Advantages: Treat Excel directly as a data source, read the content directly through SQL, and the reading speed is faster.

Disadvantages: The method of reading data is not flexible enough and cannot directly read a certain cell. Only after reading the entire Sheet page (the result is Datatable) can the specified value be obtained in the Datatable based on the number of rows and columns. .

When the amount of Excel data is large. It will occupy a lot of memory, and when there is not enough memory, a memory overflow exception will be thrown.

The reading code is as follows:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

public DataTable GetExcelTableByOleDB(string strExcelPath, string tableName)

{

  try

  {

    DataTable dtExcel = new DataTable();

    //数据表

    DataSet ds = new DataSet();

    //获取文件扩展名

    string strExtension = System.IO.Path.GetExtension(strExcelPath);

    string strFileName = System.IO.Path.GetFileName(strExcelPath);

    //Excel的连接

    OleDbConnection objConn = null;

    switch (strExtension)

    {

      case ".xls":

        objConn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 8.0;HDR=NO;IMEX=1;\"");

        break;

      case ".xlsx":

        objConn = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1;\"");

        break;

      default:

        objConn = null;

        break;

    }

    if (objConn == null)

    {

      return null;

    }

    objConn.Open();

    //获取Excel中所有Sheet表的信息

    //System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);

    //获取Excel的第一个Sheet表名

    //string tableName = schemaTable.Rows[0][2].ToString().Trim();

    string strSql = "select * from [" + tableName + "]";

    //获取Excel指定Sheet表中的信息

    OleDbCommand objCmd = new OleDbCommand(strSql, objConn);

    OleDbDataAdapter myData = new OleDbDataAdapter(strSql, objConn);

    myData.Fill(ds, tableName);//填充数据

    objConn.Close();

    //dtExcel即为excel文件中指定表中存储的信息

    dtExcel = ds.Tables[tableName];

    return dtExcel;

  }

  catch

  {

    return null;

  }

}

Copy after login

The following explains the connection string

HDR=Yes, which represents the first Rows are titles and are not used as data (but in my actual use, if there are complex values ​​in the first row, the read Datatable column titles will automatically be set to F1, F2, etc., which is inconsistent with actual applications, so At that time, all the content was read into the Datatable through HDR=No, and then the first row was manually set as the title); IMEX (IMport EXport mode) setting
IMEX has three modes:
0 is Export mode
1 is Import mode
2 is Linked mode (full update capabilities)
What I particularly want to explain here is the IMEX parameter, because different modes represent different reading and writing behaviors:
When IMEX =0 is the "export mode". The Excel file opened in this mode can only be used for "writing" purposes.
When IMEX=1, it is "Import mode". The Excel file opened in this mode can only be used for "reading" purposes.
When IMEX=2, it is "link mode". The Excel file opened in this mode can support both "reading" and "writing" purposes.

--------------------------------

In addition, read Excel2007 version of the file, the version should be changed from 8.0 to 12.0. At the same time, the driver can no longer use Jet, but should use ACE. Responsible for causing "No installable ISAM found" error.

--------------------------------

Also found on the Internet using In this method, the number of Sheet tables retrieved may be greater than the number of Sheet tables in the actual Excel table. There are two reasons for this:

1. The extracted names include those in the XL Naming Manager name (see the formula of XL2007 - Naming Manager, shortcut key Crtl+F3);

2. The extracted name includes the FilterDatabase suffix, which is used by XL to record the Filter range.

The first point is relatively simple, just delete the content in the existing naming manager; the second point is more troublesome to deal with. These names will still be retained after the Filter is deleted. The simple way is to add a new Sheet and then Copy the original Sheet into it. However, the actual situation cannot do the above checks for every Excel. The filtering scheme is given below. (We have verified this problem, please verify it yourself)

1

2

3

4

5

6

7

8

9

10

11

12

13

14

//objConn为读取Excel的链接,下面通过过滤来获取有效的Sheet页名称集合

 System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);

 List<string> lstSheetNames = new List<string>();

 for (int i = 0; i < schemaTable.Rows.Count; i++)

 {

   string strSheetName = (string)dtSheetName.Rows[i]["TABLE_NAME"];

   if (strSheetName.Contains("$") && !strSheetName.Replace("&#39;", "").EndsWith("$"))

   {

     //过滤无效SheetName完毕....

     continue;

   }

   if (lstSheetNames != null && !lstSheetNames.Contains(strSheetName))

     lstSheetNames.Add(strSheetName);

 }

Copy after login

Because the read out SheetName is invalid and the last character is generally not $. If the SheetName has some special symbols, the read SheetName will be automatically added with single quotes. For example, if you edit SheetName to MySheet(1) in Excel, the read SheetName will be: 'MySheet(1)$', so it is best to filter the single quotes before judging whether the last character is $.

--------------------------------

(2) Com component method (implemented by adding Microsoft.Office.Interop.Excel reference)

Advantages: It can read data in Excel very flexibly, and users can flexibly call various functions for processing.

Disadvantages: Cell-based processing has a slow reading speed. It is best not to use this method to read files with large amounts of data.

You need to add the corresponding DLL reference. This reference must exist before it can be used. If the Web site is deployed on IIS, Excel needs to be installed on the server machine, and sometimes IIS permissions need to be configured.

The reading code is as follows:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

private Stopwatch wath = new Stopwatch();

/// <summary>

/// 使用COM读取Excel

/// </summary>

/// <param name="excelFilePath">路径</param>

/// <returns>DataTabel</returns>

public System.Data.DataTable GetExcelData(string excelFilePath)

{

  Excel.Application app = new Excel.Application();

  Excel.Sheets sheets;

  Excel.Workbook workbook = null;

  object oMissiong = System.Reflection.Missing.Value;

  System.Data.DataTable dt = new System.Data.DataTable();

  wath.Start();

  try

  {

    if (app == null)

    {

      return null;

    }

    workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong,

      oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong);

    //将数据读入到DataTable中——Start

    sheets = workbook.Worksheets;

    Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);//读取第一张表

    if (worksheet == null)

      return null;

    string cellContent;

    int iRowCount = worksheet.UsedRange.Rows.Count;

    int iColCount = worksheet.UsedRange.Columns.Count;

    Excel.Range range;

    //负责列头Start

    DataColumn dc;

    int ColumnID = 1;

    range = (Excel.Range)worksheet.Cells[1, 1];

    while (range.Text.ToString().Trim() != "")

    {

      dc = new DataColumn();

      dc.DataType = System.Type.GetType("System.String");

      dc.ColumnName = range.Text.ToString().Trim();

      dt.Columns.Add(dc);

  

      range = (Excel.Range)worksheet.Cells[1, ++ColumnID];

    }

    //End

    for (int iRow = 2; iRow <= iRowCount; iRow++)

    {

      DataRow dr = dt.NewRow();

      for (int iCol = 1; iCol <= iColCount; iCol++)

      {

        range = (Excel.Range)worksheet.Cells[iRow, iCol];

        cellContent = (range.Value2 == null) ? "" : range.Text.ToString();

          dr[iCol - 1] = cellContent;

      }

      dt.Rows.Add(dr);

    }

    wath.Stop();

    TimeSpan ts = wath.Elapsed;

    //将数据读入到DataTable中——End

    return dt;

  }

  catch

  {

    return null;

  }

  finally

  {

    workbook.Close(false, oMissiong, oMissiong);

    System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook);

    workbook = null;

    app.Workbooks.Close();

    app.Quit();

    System.Runtime.InteropServices.Marshal.ReleaseComObject(app);

    app = null;

    GC.Collect();

    GC.WaitForPendingFinalizers();

  }

}

/// <summary>

/// 使用COM,多线程读取Excel(1 主线程、4 副线程)

/// </summary>

/// <param name="excelFilePath">路径</param>

/// <returns>DataTabel</returns>

public System.Data.DataTable ThreadReadExcel(string excelFilePath)

{

  Excel.Application app = new Excel.Application();

  Excel.Sheets sheets = null;

  Excel.Workbook workbook = null;

  object oMissiong = System.Reflection.Missing.Value;

  System.Data.DataTable dt = new System.Data.DataTable();

  wath.Start();

  try

  {

    if (app == null)

    {

      return null;

    }

    workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong,

      oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong);

    //将数据读入到DataTable中——Start

    sheets = workbook.Worksheets;

    Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);//读取第一张表

    if (worksheet == null)

      return null;

    string cellContent;

    int iRowCount = worksheet.UsedRange.Rows.Count;

    int iColCount = worksheet.UsedRange.Columns.Count;

    Excel.Range range;

    //负责列头Start

    DataColumn dc;

    int ColumnID = 1;

    range = (Excel.Range)worksheet.Cells[1, 1];

    while (iColCount >= ColumnID)

    {

      dc = new DataColumn();

      dc.DataType = System.Type.GetType("System.String");

      string strNewColumnName = range.Text.ToString().Trim();

      if (strNewColumnName.Length == 0) strNewColumnName = "_1";

      //判断列名是否重复

      for (int i = 1; i < ColumnID; i++)

      {

        if (dt.Columns[i - 1].ColumnName == strNewColumnName)

          strNewColumnName = strNewColumnName + "_1";

      }

      dc.ColumnName = strNewColumnName;

      dt.Columns.Add(dc);

      range = (Excel.Range)worksheet.Cells[1, ++ColumnID];

    }

    //End

    //数据大于500条,使用多进程进行读取数据

    if (iRowCount - 1 > 500)

    {

      //开始多线程读取数据

      //新建线程

      int b2 = (iRowCount - 1) / 10;

      DataTable dt1 = new DataTable("dt1");

      dt1 = dt.Clone();

      SheetOptions sheet1thread = new SheetOptions(worksheet, iColCount, 2, b2 + 1, dt1);

      Thread othread1 = new Thread(new ThreadStart(sheet1thread.SheetToDataTable));

      othread1.Start();

      //阻塞 1 毫秒,保证第一个读取 dt1

      Thread.Sleep(1);

      DataTable dt2 = new DataTable("dt2");

      dt2 = dt.Clone();

      SheetOptions sheet2thread = new SheetOptions(worksheet, iColCount, b2 + 2, b2 * 2 + 1, dt2);

      Thread othread2 = new Thread(new ThreadStart(sheet2thread.SheetToDataTable));

      othread2.Start();

      DataTable dt3 = new DataTable("dt3");

      dt3 = dt.Clone();

      SheetOptions sheet3thread = new SheetOptions(worksheet, iColCount, b2 * 2 + 2, b2 * 3 + 1, dt3);

      Thread othread3 = new Thread(new ThreadStart(sheet3thread.SheetToDataTable));

      othread3.Start();

      DataTable dt4 = new DataTable("dt4");

      dt4 = dt.Clone();

      SheetOptions sheet4thread = new SheetOptions(worksheet, iColCount, b2 * 3 + 2, b2 * 4 + 1, dt4);

      Thread othread4 = new Thread(new ThreadStart(sheet4thread.SheetToDataTable));

      othread4.Start();

      //主线程读取剩余数据

      for (int iRow = b2 * 4 + 2; iRow <= iRowCount; iRow++)

      {

        DataRow dr = dt.NewRow();

        for (int iCol = 1; iCol <= iColCount; iCol++)

        {

          range = (Excel.Range)worksheet.Cells[iRow, iCol];

          cellContent = (range.Value2 == null) ? "" : range.Text.ToString();

          dr[iCol - 1] = cellContent;

        }

        dt.Rows.Add(dr);

      }

      othread1.Join();

      othread2.Join();

      othread3.Join();

      othread4.Join();

      //将多个线程读取出来的数据追加至 dt1 后面

      foreach (DataRow dr in dt.Rows)

        dt1.Rows.Add(dr.ItemArray);

      dt.Clear();

      dt.Dispose();

      foreach (DataRow dr in dt2.Rows)

        dt1.Rows.Add(dr.ItemArray);

      dt2.Clear();

      dt2.Dispose();

      foreach (DataRow dr in dt3.Rows)

        dt1.Rows.Add(dr.ItemArray);

      dt3.Clear();

      dt3.Dispose();

      foreach (DataRow dr in dt4.Rows)

        dt1.Rows.Add(dr.ItemArray);

      dt4.Clear();

      dt4.Dispose();

      return dt1;

    }

    else

    {

      for (int iRow = 2; iRow <= iRowCount; iRow++)

      {

        DataRow dr = dt.NewRow();

        for (int iCol = 1; iCol <= iColCount; iCol++)

        {

          range = (Excel.Range)worksheet.Cells[iRow, iCol];

          cellContent = (range.Value2 == null) ? "" : range.Text.ToString();

          dr[iCol - 1] = cellContent;

        }

        dt.Rows.Add(dr);

      }

    }

    wath.Stop();

    TimeSpan ts = wath.Elapsed;

    //将数据读入到DataTable中——End

    return dt;

  }

  catch

  {

    return null;

  }

  finally

  {

    workbook.Close(false, oMissiong, oMissiong);

    System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook);

    System.Runtime.InteropServices.Marshal.ReleaseComObject(sheets);

    workbook = null;

    app.Workbooks.Close();

    app.Quit();

    System.Runtime.InteropServices.Marshal.ReleaseComObject(app);

    app = null;

    GC.Collect();

    GC.WaitForPendingFinalizers();

  }

}

Copy after login

(3) NPOI method to read Excel (this method has not been tested)

NPOI is the .NET version of the POI project. POI is an open source Java project for reading and writing Excel, WORD and other Microsoft OLE2 component documents. Using NPOI, you can read and write WORD/EXCEL documents on a machine that does not have Office installed or a corresponding environment.

Advantages: Reading Excel is faster, and the reading method is flexible.

Disadvantages: The corresponding plug-in needs to be downloaded and added to the system reference.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

/// <summary>

/// 将excel中的数据导入到DataTable中

/// </summary>

/// <param name="sheetName">excel工作薄sheet的名称</param>

/// <param name="isFirstRowColumn">第一行是否是DataTable的列名</param>

/// <returns>返回的DataTable</returns>

public DataTable ExcelToDataTable(string sheetName, bool isFirstRowColumn)

{

  ISheet sheet = null;

  DataTable data = new DataTable();

  int startRow = 0;

  try

  {

    fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);

    if (fileName.IndexOf(".xlsx") > 0) // 2007版本

      workbook = new XSSFWorkbook(fs);

    else if (fileName.IndexOf(".xls") > 0) // 2003版本

      workbook = new HSSFWorkbook(fs);

    if (sheetName != null)

    {

      sheet = workbook.GetSheet(sheetName);

    }

    else

    {

      sheet = workbook.GetSheetAt(0);

    }

    if (sheet != null)

    {

      IRow firstRow = sheet.GetRow(0);

      int cellCount = firstRow.LastCellNum; //一行最后一个cell的编号 即总的列数

      if (isFirstRowColumn)

      {

        for (int i = firstRow.FirstCellNum; i < cellCount; ++i)

        {

          DataColumn column = new DataColumn(firstRow.GetCell(i).StringCellValue);

          data.Columns.Add(column);

        }

        startRow = sheet.FirstRowNum + 1;

      }

      else

      {

        startRow = sheet.FirstRowNum;

      }

      //最后一列的标号

      int rowCount = sheet.LastRowNum;

      for (int i = startRow; i <= rowCount; ++i)

      {

        IRow row = sheet.GetRow(i);

        if (row == null) continue; //没有数据的行默认是null       

          

        DataRow dataRow = data.NewRow();

        for (int j = row.FirstCellNum; j < cellCount; ++j)

        {

          if (row.GetCell(j) != null) //同理,没有数据的单元格都默认是null

            dataRow[j] = row.GetCell(j).ToString();

        }

        data.Rows.Add(dataRow);

      }

    }

    return data;

  }

  catch (Exception ex)

  {

    Console.WriteLine("Exception: " + ex.Message);

    return null;

  }

}

Copy after login

For more C# three ways to read Excel and comparative analysis related articles, please pay attention to the PHP Chinese website!


Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to use various symbols in C language How to use various symbols in C language Apr 03, 2025 pm 04:48 PM

The usage methods of symbols in C language cover arithmetic, assignment, conditions, logic, bit operators, etc. Arithmetic operators are used for basic mathematical operations, assignment operators are used for assignment and addition, subtraction, multiplication and division assignment, condition operators are used for different operations according to conditions, logical operators are used for logical operations, bit operators are used for bit-level operations, and special constants are used to represent null pointers, end-of-file markers, and non-numeric values.

What is the role of char in C strings What is the role of char in C strings Apr 03, 2025 pm 03:15 PM

In C, the char type is used in strings: 1. Store a single character; 2. Use an array to represent a string and end with a null terminator; 3. Operate through a string operation function; 4. Read or output a string from the keyboard.

The difference between multithreading and asynchronous c# The difference between multithreading and asynchronous c# Apr 03, 2025 pm 02:57 PM

The difference between multithreading and asynchronous is that multithreading executes multiple threads at the same time, while asynchronously performs operations without blocking the current thread. Multithreading is used for compute-intensive tasks, while asynchronously is used for user interaction. The advantage of multi-threading is to improve computing performance, while the advantage of asynchronous is to not block UI threads. Choosing multithreading or asynchronous depends on the nature of the task: Computation-intensive tasks use multithreading, tasks that interact with external resources and need to keep UI responsiveness use asynchronous.

How to use char array in C language How to use char array in C language Apr 03, 2025 pm 03:24 PM

The char array stores character sequences in C language and is declared as char array_name[size]. The access element is passed through the subscript operator, and the element ends with the null terminator '\0', which represents the end point of the string. The C language provides a variety of string manipulation functions, such as strlen(), strcpy(), strcat() and strcmp().

How to handle special characters in C language How to handle special characters in C language Apr 03, 2025 pm 03:18 PM

In C language, special characters are processed through escape sequences, such as: \n represents line breaks. \t means tab character. Use escape sequences or character constants to represent special characters, such as char c = '\n'. Note that the backslash needs to be escaped twice. Different platforms and compilers may have different escape sequences, please consult the documentation.

How to convert char in C language How to convert char in C language Apr 03, 2025 pm 03:21 PM

In C language, char type conversion can be directly converted to another type by: casting: using casting characters. Automatic type conversion: When one type of data can accommodate another type of value, the compiler automatically converts it.

The difference between char and wchar_t in C language The difference between char and wchar_t in C language Apr 03, 2025 pm 03:09 PM

In C language, the main difference between char and wchar_t is character encoding: char uses ASCII or extends ASCII, wchar_t uses Unicode; char takes up 1-2 bytes, wchar_t takes up 2-4 bytes; char is suitable for English text, wchar_t is suitable for multilingual text; char is widely supported, wchar_t depends on whether the compiler and operating system support Unicode; char is limited in character range, wchar_t has a larger character range, and special functions are used for arithmetic operations.

What is the function of C language sum? What is the function of C language sum? Apr 03, 2025 pm 02:21 PM

There is no built-in sum function in C language, so it needs to be written by yourself. Sum can be achieved by traversing the array and accumulating elements: Loop version: Sum is calculated using for loop and array length. Pointer version: Use pointers to point to array elements, and efficient summing is achieved through self-increment pointers. Dynamically allocate array version: Dynamically allocate arrays and manage memory yourself, ensuring that allocated memory is freed to prevent memory leaks.

See all articles