General thread for quick questions

Dacke

Level 10

Re: General thread for quick questions

« Reply #220 on: September 12, 2015, 07:05:05 PM »

Post the message?


	Logged

programming • free software
animal liberation • veganism
anarcho-communism • intersectionality • feminism

ProgramGamer

Administrator
Level 10

aka Mireille

Re: General thread for quick questions

« Reply #221 on: September 12, 2015, 07:21:46 PM »

Code:

fatal: The current branch master has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream www.WebSiteWhereIWantToUploadMyThings.com master

Here's the message


	Logged

[ BAWOB on itch.io | Space Punk Slam Dunk's Devlog ]

Dacke

Level 10

Re: General thread for quick questions

« Reply #222 on: September 12, 2015, 08:33:57 PM »

What command did you use? What does git remote say?

Code:

$ git remote -v

edit: My guess is that you've forgotten to connect your local repository (on your computer) to your bitbucket repository. Did you add bitbucket as a remote repository, as per the bitbucket tutorial?
https://confluence.atlassian.com/bitbucket/create-a-repository-221449521.html

Code:

$ git remote add origin ssh://git@bitbucket.org/username/bbreponame.git


« Last Edit: September 13, 2015, 05:00:08 AM by Dacke »	Logged


gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #223 on: September 13, 2015, 07:31:04 AM »

wrote this

Code:

import os
import sys

File = open("C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt","r")
line = File.readline()
print(line)

got this

Code:

C:\Python34\python.exe C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 4
    File = open("C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt","r")
               ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Process finished with exit code 1

What's wrong??


	Logged

♥♥♥ Send me a WiiU, hit me on PM ♥♥♥

https://forums.tigsource.com/index.php?topic=48415.new#new progen
https://forums.tigsource.com/index.php?topic=32227.new#new Game art trick
404
https://forums.tigsource.com/index.php?topic=49818.0
https://forums.tigsource.com/index.php?topic=68138.n

Layl

Level 3

professional jerkface

Re: General thread for quick questions

« Reply #224 on: September 13, 2015, 07:33:24 AM »

Quote from: Jimym GIMBERT on September 13, 2015, 07:31:04 AM

What's wrong??

String escaping
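To spell that out a bit: in a normal Python 3 string literal, `\U` starts an eight-digit Unicode escape, so a path beginning with `C:\Users` fails before the program even runs. A minimal sketch of the usual ways around it, using a shortened made-up path rather than the one from the post:

```python
# Hypothetical path, for illustration only.
escaped = "C:\\Users\\user\\Documents\\dictionary.txt"   # double every backslash
raw = r"C:\Users\user\Documents\dictionary.txt"          # raw string: backslashes stay literal
forward = "C:/Users/user/Documents/dictionary.txt"       # Windows also accepts forward slashes

# The first two spell exactly the same string.
print(escaped == raw)

# The unescaped form does not even compile, because \U starts a \UXXXXXXXX escape.
try:
    compile(r'path = "C:\Users\user"', "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```

Any of the three spellings should work with open(); the raw string is probably the least typing.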


	Logged

indie11

Level 2

Re: General thread for quick questions

« Reply #225 on: September 13, 2015, 07:45:47 AM »

Has anyone here ever worked on a turn-based multiplayer game in Unity? If so, did you roll your own system or use some 3rd-party API?


	Logged

http://thegameswemake.itch.io/

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #226 on: September 13, 2015, 07:52:58 AM »

I solved it by randomly stumbling on an unrelated Stack Overflow question about another problem. Using "r" as a prefix solves it (it makes the string raw); apparently it's the \U in \Users that is the problem. Now I have

Code:

import os
import sys

File = open(r"C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt", "r")  # r before "" for raw text
line = File.readline()
print(line)
l = list(File)
for line in File:
    print (line)

which led to

Code:

Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 8, in <module>
    for line in File:
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 4847: character maps to <undefined>

When I comment out the line

Code:

l = list(File)

the output gets cut off here

Code:

colombia

2009_banja_luka_challenger

bosnia_and_herzegovina

2009_barcelona_open_banco_sabadell

barcelona

2009_barcelona_open_banco_sabadell

spain

2009_bh_telecom_indoors

Looking at the same place in the file

Quote

genoa
2009_asb_classic
auckland
2009_australian_open
victoria/n/australia
2009_bancolombia_open
colombia
2009_banja_luka_challenger
bosnia_and_herzegovina
2009_barcelona_open_banco_sabadell
barcelona
2009_barcelona_open_banco_sabadell
spain
2009_bh_telecom_indoors
bosnia_and_herzegovina
2009_bh_tennis_open_international_cup
brazil
2009_brazilian_grand_prix
autódromo_josé_carlos_pace
2009_brazilian_grand_prix
são_paulo
2009_british_grand_prix
buckinghamshire

Nothing suspect at all, the next line looks perfectly fine ...


	Logged

Dacke

Level 10

Re: General thread for quick questions

« Reply #227 on: September 13, 2015, 08:10:38 AM »

Did you look at invisible characters? Maybe there is a different kind of whitespace/linebreak at that position?

What encoding is the file? Have you tried specifically setting the right encoding?
http://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character
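For context: the traceback goes through cp1252.py because open() without an encoding argument falls back to the platform default, which on that Windows setup is cp1252. Here is a small sketch that reproduces the same failure on bytes in memory. The sample bytes are made up for the demonstration and are not taken from the actual file.

```python
# Byte 0x8d has no character assigned in cp1252, so decoding it fails,
# the same way as in the traceback.
sample = b"s\xc3\x8do"  # made-up sample: the UTF-8 bytes for "sÍo"

try:
    sample.decode("cp1252")
except UnicodeDecodeError as err:
    print("cp1252 failed at byte", hex(sample[err.start]), "position", err.start)

# The same bytes decode cleanly once the right encoding is named.
print(sample.decode("utf-8"))
```

With a real file the equivalent is passing the encoding to open(), for example open(path, encoding="utf-8"), assuming the file really is UTF-8.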


	Logged


gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #228 on: September 13, 2015, 08:26:57 AM »

It's from the Blitz parser I wrote earlier, so if there were a strange key code, it should be on every line! Maybe the original data has some strange char?

I'm looking at your link

The Blitz code in question, super straightforward ...

Code:

Include "blitz 3D test parser.bb" ;http://www.blitzbasic.com/codearcs/codearcs.php?code=161

;http://www.zytrax.com/tech/codes.htm
;HT-09-9-Horizontal Tab



; Set The Graphic Mode 
Graphics 600,300,0,2





; Open the file to Read
filein = ReadFile("C:\Users\user\Desktop\part_00.csv")

file$ = ""
this = 0

currentFolder$ = CurrentDir() + "#1 ConceptNet Relations"

dictionary$ = "\dictionary.txt"
Dictfile = WriteFile (currentFolder + dictionary)


If FileType(currentFolder) <> 2 Then 
	Print "no folder found! - trying to create new folder"
	CreateDir currentFolder 
	Print currentFolder 
	If FileType(currentFolder) <> 2 Then
		Print "Creation failed!!":WaitKey()
	Else 
		Print "creation succeed!":WaitKey()
	EndIf
EndIf

Print


Print "Lines of text read from file " + filein
Print

rtemp$ = "";for list of different relation
rcount = 0






;MAINLOOP: skim through each line parsing and filtering data
While Not Eof( filein ) Or KeyHit (1)

	Read1$ = ReadLine$( filein )	; read a new line
	parse( Read1, Chr(9) )			; parse the line into chunk

	prevcount = count
	count = 0



;strip the non desired data from the chunk, discard useless chunk
	For back.parsereturn=Each parsereturn ;flip through the chunk
	
	;let's try to rip the member after the assertion
		;If count = 1 Then relation$ = back\word 	;store the relation
		If count = 2 Then arg1$ = back\word			;store argument 1
		If count = 3 Then arg2$ = back\word			;store argument 2

		count = count +1
	Next	



	If Instr(arg1,"/c/en/") <> 0 And Instr(arg2,"/c/en/") <> 0 Then
		arg1=Replace (arg1,"/c/en/",""):
		arg2=Replace (arg2,"/c/en/","")
		
		ParseArg(arg1,dictfile)
		ParseArg(arg2,dictfile)
	EndIf


	



;visualization control	
	If KeyHit (28) Then
		WaitKey()
		Print "pause"
	EndIf 
	;Print

Wend
;END MAINLOOP





CloseFile (Dictfile)
Print "end " + rcount ;50
CloseFile( filein )
;WaitKey()





;----------------------------------------------------------------



Function ParseArg(arg$, file)
	parse (arg, ",")
	Local count = 0

	For back.parsereturn = Each parsereturn
		count = count + 1
		WriteLine( file , back\word )
	Next

;	writeline (file, element)

;	argcount$ = count
;	Print argcount + " " + arg

End Function

EDIT:

Oh it seems it's about the latin capital or something like that

http://www.i18nqa.com/debug/bug-double-conversion.html
http://www.fileformat.info/info/unicode/char/00cd/index.htm

No No NO


« Last Edit: September 13, 2015, 08:32:08 AM by Jimym GIMBERT »	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #229 on: September 13, 2015, 01:19:28 PM »

Code:

File = open(r"C:\Users\user\Documents\#1 ConceptNet Relations\dictionary.txt", "r", encoding="utf8")

Code:

Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 9, in <module>
    print(line)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1ed9' in position 11: character maps to <undefined>
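Note that this one is a UnicodeEncodeError raised inside print(): the read succeeded, but the console's cp1252 codec has no way to write '\u1ed9'. A rough sketch of one workaround, assuming the goal is only to get something visible on a cp1252 console. The helper name safe_for_console is made up for this example.

```python
import sys

def safe_for_console(text, encoding=None):
    """Return text with characters the console cannot encode shown as \\uXXXX escapes."""
    # sys.stdout.encoding may be None when output is redirected; fall back to UTF-8.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

# '\u1ed9' is the character from the traceback; it has no cp1252 equivalent.
print(safe_for_console("m\u1ed9t", encoding="cp1252"))
```

The data itself is untouched; only what gets printed is escaped, so this is a display fix and not a fix for the file.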


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #230 on: September 13, 2015, 01:35:13 PM »

Okay, that's weird. I opened the file in Notepad++ and converted it to many different encodings, but I still get an error no matter what, generally at a consistent but different break point. HALP!


	Logged

Cheesegrater

Level 1

Re: General thread for quick questions

« Reply #231 on: September 13, 2015, 02:04:37 PM »

It's probably not UTF-8. Have you tried latin-1? UTF-16?


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #232 on: September 13, 2015, 02:12:52 PM »

I'm trying everything; I haven't found a list of the encoding names Python accepts yet.

EDIT: Latin-1 DID IT!
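Worth knowing why Latin-1 "works": it assigns a character to every one of the 256 byte values, so decoding with it can never fail. That alone doesn't prove the file is Latin-1. A small sketch with a made-up word shows what happens when UTF-8 bytes are read that way:

```python
word = "josé"

utf8_bytes = word.encode("utf-8")        # b'jos\xc3\xa9'

# Latin-1 accepts any byte sequence, so this never raises...
as_latin1 = utf8_bytes.decode("latin-1")
print(as_latin1)                          # ...but the accented letter comes out as two characters

# Decoding with the encoding the bytes were written in recovers the original.
print(utf8_bytes.decode("utf-8") == word)
```

So if accented names look doubled-up after reading as Latin-1, the file is likely UTF-8 (or mixed) underneath.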


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #233 on: September 13, 2015, 02:19:16 PM »

OKAY, now I can print with repr(line) but not directly.

Better though, I just need to kill the \n.


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #234 on: September 13, 2015, 08:25:35 PM »

Well, it still doesn't work:

The goal was to use a "set" to remove all duplicates, but as soon as I moved from repr to the actual set function it crashed because of unicode .....

Research shows that unicode is a nightmare in Python (and in general). I don't know what to do, so for now I'm filing it as a failure. The problem is that there are 2 140 288 lines in the folder, and checking them all for duplicates by brute force would take n² time.

Turns out I need a crash course in sorting, no way to avoid it now.
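On the sorting idea: removing duplicates doesn't need the n² comparison. Sorting puts equal lines next to each other, so a single pass over the sorted data is enough, which is roughly O(n log n) overall. A minimal sketch; the function name and the sample lines are made up.

```python
from itertools import groupby

def unique_sorted(lines):
    """Return the distinct lines in sorted order."""
    # After sorting, duplicates are adjacent; groupby collapses each run to one key.
    return [key for key, _group in groupby(sorted(lines))]

sample = ["spain", "barcelona", "spain", "colombia", "barcelona"]
print(unique_sorted(sample))
```

This also gives the alphabetical order for free, which helps if the result is meant to become an index.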


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #235 on: September 13, 2015, 09:07:30 PM »

http://textmechanic.com/Big-File-Tool-Remove-Duplicate-Lines.html
The Internet has everything


	Logged

Dacke

Level 10

Re: General thread for quick questions

« Reply #236 on: September 13, 2015, 10:11:42 PM »

If possible, try to avoid having text files in anything but utf8. Death to latin1.

Using a set for utf8 text works perfectly fine.

I made this file and encoded it in utf8:

Code:

Öñü中華民族日本語
ひらがな平仮名
2009_bh_telecom_indoors
bosnia_and_herzegovina
Öñü中華民族日本語
2009_bh_telecom_indoors

Then I wrote this python3 program:

Code:

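# open the file, naming the encoding instead of relying on the platform default
with open("utf8lines.txt", encoding="utf-8") as file:
    # create set
    unique_lines = set()

    # strip and add all lines in file to set (duplicates are dropped)
    for line in file:
        unique_lines.add(line.strip())

# print all lines in set (a set has no guaranteed order)
for line in unique_lines:
    print(line)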

Which correctly outputs the unique lines:

Code:

Öñü中華民族日本語
ひらがな平仮名
2009_bh_telecom_indoors
bosnia_and_herzegovina


	Logged


gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #237 on: September 13, 2015, 10:38:44 PM »

To be frank, I'm parsing a text derived from another text that I didn't build originally (ConceptNet).

I did fuck up some characters, which means I will have mismatches in the next step.

Basically I have 5 GB of CSV data in 5 files. I extracted all the relevant data from the first file, only for English concepts, into separate files where the semantic relation is the name of the file and the arguments are tab-separated on each line. Then I extracted all the arguments into a dictionary file and tried to remove duplicates and order them alphabetically, in the hope of building an index.

The next step would have been to replace the arguments in the relation files with the concept index from the dictionary file, then generalize to all concepts in all languages, extracting the remaining data in the 5 GB database by automatically detecting other-language and cross-language concepts.
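The indexing step described here could look roughly like this in Python. It's only a sketch under assumptions: each relation line holds tab-separated arguments as described, and the function names and sample data are invented for the example.

```python
def build_index(relation_lines):
    """Map each distinct argument to its position in the alphabetically sorted dictionary."""
    arguments = set()
    for line in relation_lines:
        arguments.update(line.rstrip("\n").split("\t"))
    return {word: i for i, word in enumerate(sorted(arguments))}

def encode_relations(relation_lines, index):
    """Replace every argument in each line with its dictionary index."""
    return [
        tuple(index[arg] for arg in line.rstrip("\n").split("\t"))
        for line in relation_lines
    ]

# Made-up sample in the "arg1<TAB>arg2" layout.
lines = ["barcelona\tspain\n", "genoa\titaly\n", "madrid\tspain\n"]
index = build_index(lines)
print(index)
print(encode_relations(lines, index))
```

Keeping the index as a dict makes the replacement a constant-time lookup per argument, so the whole pass stays linear in the number of lines.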

So far it's compromised, at least with my current implementation.

I'll try your stuff.


	Logged

gimymblert

Level 10

The archivest master, leader of all documents

Re: General thread for quick questions

« Reply #238 on: September 13, 2015, 11:38:54 PM »

nope

Code:

bar_aqueduct
jujubinus_poppei
yunshan_road_station
yves_duval
brinklow
donggyo-dong
fuentelsaz_de_soria
étienne_boulay
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/hellopython/ParseDictionaryToUnique.py", line 24, in <module>
    print(line)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0144' in position 5: character maps to <undefined>


	Logged

Dacke

Level 10

Re: General thread for quick questions

« Reply #239 on: September 13, 2015, 11:46:29 PM »

You still have to make sure to get the encoding right. But that's true no matter what programming language you use, python isn't better or worse.

edit: This isn't the issue, Boris gets it right in the next post.


« Last Edit: September 13, 2015, 11:53:55 PM by Dacke »	Logged

